
2018-01, Pan-Net

Intro

What

What for

Python libraries

HTTP message

HTTP message format (both request > and response <)

$ curl reisinge.net -v
* Rebuilt URL to: reisinge.net/
*   Trying 109.230.20.210...
* Connected to reisinge.net (109.230.20.210) port 80 (#0)
> GET / HTTP/1.1
> Host: reisinge.net
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 302 Moved Temporarily
< Server: nginx/1.12.2
< Date: Thu, 11 Jan 2018 07:51:16 GMT
< Content-Type: text/html
< Content-Length: 161
< Connection: keep-alive
< Location: http://jreisinger.github.io
<
<html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx/1.12.2</center>
</body>
</html>
* Connection #0 to host reisinge.net left intact

The client can’t issue another request over the same socket until the response is finished.

For requests, the body can carry parameters (for POST or PUT requests) or the contents of a file being uploaded. For responses, the body is the payload of the requested resource (e.g. HTML, image data, or query results). The message body is not necessarily human-readable, since it can contain images or other binary data. The body can also be empty, as for GET requests or most error messages.
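
The message layout described above (start line, headers, blank line, body) can be sketched in a few lines of Python. This is only an illustration: the helper name split_message is made up, and the sample response is a shortened version of the curl output with a five-byte body.

```python
# A shortened copy of the response above; the 5-byte body is an assumption
# made for illustration so that Content-Length is easy to check.
RAW = (
    b"HTTP/1.1 302 Moved Temporarily\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"Location: http://jreisinger.github.io\r\n"
    b"\r\n"
    b"hello"
)

def split_message(raw):
    """Split a raw HTTP message into (start line, headers dict, body bytes)."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

start_line, headers, body = split_message(RAW)
print(start_line)
print(headers["location"])
print(len(body) == int(headers["content-length"]))
```

The body is kept as bytes on purpose, since it may be binary data.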

URL

scheme://[username:password@]hostname[:port][/path][?query][#anchor]
http://example.com:81/a/b.html?user=Alice&year=2020#p2
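
To see how the parts of the example URL map onto the pattern, the standard library's urllib.parse can take it apart. A small sketch; note that Python calls the anchor a "fragment".

```python
from urllib.parse import urlsplit, parse_qs

url = "http://example.com:81/a/b.html?user=Alice&year=2020#p2"
parts = urlsplit(url)

print(parts.scheme)    # http
print(parts.hostname)  # example.com
print(parts.port)      # 81
print(parts.path)      # /a/b.html
print(parts.query)     # user=Alice&year=2020
print(parts.fragment)  # p2

# The query string decodes into a dict of lists (a key may repeat).
query = parse_qs(parts.query)
print(query)           # {'user': ['Alice'], 'year': ['2020']}
```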

Methods (verbs)

GET

POST
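
A rough sketch of how the two verbs usually carry parameters: GET puts them in the URL's query string, POST puts them in the message body. The http://example.com/search URL and the parameter names are made up, and nothing is sent over the network here; the Request objects are only built and inspected.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"user": "Alice", "year": "2020"}
encoded = urlencode(params)  # 'user=Alice&year=2020'

# GET: parameters travel in the URL, the body stays empty.
get_req = Request("http://example.com/search?" + encoded)

# POST: parameters travel in the body; urllib switches the method
# to POST as soon as data is supplied.
post_req = Request("http://example.com/search", data=encoded.encode("ascii"))

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
```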

Encoding

HTTP transfer encoding <-> content encoding

Transfer encoding (Content-Length or chunked encoding, raw or compressed)

GET / HTTP/1.1
Accept-Encoding: gzip
...

HTTP/1.1 200 OK
Content-Length: 3913
Transfer-Encoding: gzip
...
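
For the chunked variant mentioned above, the body arrives as a series of chunks, each prefixed with its size in hexadecimal, and ends with a zero-size chunk. A minimal decoder sketch follows; decode_chunked is a made-up helper, it ignores trailers, and it assumes the whole body is already in memory.

```python
def decode_chunked(data):
    """Reassemble a body sent with Transfer-Encoding: chunked."""
    body = b""
    pos = 0
    while True:
        # Each chunk starts with "<size in hex>[;extensions]\r\n".
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";")[0]
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break  # a zero-size chunk marks the end of the body
        start = line_end + 2
        body += data[start:start + size]
        # Skip the chunk data and the CRLF that follows it.
        pos = start + size + 2
    return body

wire = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
decoded = decode_chunked(wire)
print(decoded)  # b'Wikipedia'
```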

Content type - what format will be selected to represent a given resource

Content encoding - if the content type above is text, which character encoding (charset) is used to turn text code points into bytes

Content-Type: text/html; charset=utf-8
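
One possible way to pull the type and the charset out of such a header is the standard library's email.message.Message, since HTTP headers follow the same "Name: value; param=value" layout. The sample body bytes are an assumption for illustration.

```python
from email.message import Message

msg = Message()
msg["Content-Type"] = "text/html; charset=utf-8"

content_type = msg.get_content_type()  # the format of the resource
charset = msg.get_content_charset()    # how text was turned into bytes

# A made-up body: the bytes on the wire for a short HTML snippet.
raw_body = "<p>café</p>".encode("utf-8")
text = raw_body.decode(charset)

print(content_type, charset)
print(text)
```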

Authentication and cookies

Basic Auth (HTTP-mediated authentication)
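
With Basic Auth the client sends "username:password" base64-encoded in an Authorization header. The sketch below shows the encoding; basic_auth_header is a made-up helper and the credentials are the well-known sample pair from the Basic Auth RFC. Base64 is not encryption, which is why this scheme is only reasonable over TLS.

```python
import base64

def basic_auth_header(username, password):
    """Return the value of an Authorization header for Basic Auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return "Basic " + token

header_value = basic_auth_header("Aladdin", "open sesame")
print("Authorization:", header_value)

# Anyone who sees the header can reverse it.
print(base64.b64decode(header_value.split()[1]))
```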

TLS/SSL

Cookies

(initial request)
POST /login HTTP/1.1
...

(initial response)
HTTP/1.1 200 OK
Set-Cookie: session-id=d41d8cd98f00b204e9800998ecf8427e; Path=/
...

(all subsequent requests to the same host)
GET /login HTTP/1.1
Cookie: session-id=d41d8cd98f00b204e9800998ecf8427e
...
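
The exchange above can be mimicked with the standard library's http.cookies: parse the Set-Cookie value from the response, then build the Cookie header for later requests. A sketch only; real clients (browsers, requests.Session) also check domain, path and expiry before sending a cookie back.

```python
from http.cookies import SimpleCookie

# Value of the Set-Cookie header from the initial response.
set_cookie = "session-id=d41d8cd98f00b204e9800998ecf8427e; Path=/"

jar = SimpleCookie()
jar.load(set_cookie)

morsel = jar["session-id"]
print(morsel.value)    # the session identifier
print(morsel["path"])  # attribute that limits where the cookie is sent

# Only name=value pairs go back to the server, not the attributes.
cookie_header = "; ".join(f"{name}={m.value}" for name, m in jar.items())
print("Cookie:", cookie_header)
```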

Keep-Alive

Status codes

Returned by a server with each response.

1xx - Informational (Hold on)

2xx - Success (Here you go)

3xx - Redirects (Go the other way)

>>> import requests
>>> r = requests.get('http://httpbin.org/status/301', allow_redirects=False)
>>> (r.status_code, r.url, r.headers['Location'])
(301, 'http://httpbin.org/status/301', '/redirect/1')

4xx - Client errors (You messed up)

5xx - Server errors (I messed up)
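
Since the class is just the first digit of the code, it can be derived with integer division. A small sketch; status_class and the CLASSES table are made up for illustration, while HTTPStatus comes from the standard library and supplies the usual reason phrases.

```python
from http import HTTPStatus

# The class names follow the list above.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirects",
    4: "Client errors",
    5: "Server errors",
}

def status_class(code):
    """Map a status code to its class using the first digit."""
    return CLASSES[code // 100]

for code in (200, 302, 404, 503):
    print(code, HTTPStatus(code).phrase, "-", status_class(code))
```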

Various

Minimal correct request nowadays (without the Host header the server typically answers with an error such as 404):

GET /html/rfc7230 HTTP/1.1
Host: tools.ietf.org
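
On the wire, the request above is just these bytes: a request line, the Host header, and an empty line, each terminated by CRLF. A sketch that builds them; minimal_request is a made-up helper, and actually sending the bytes (e.g. over a TLS-wrapped socket) is left out.

```python
def minimal_request(host, path="/"):
    """Build the smallest HTTP/1.1 GET request for the given host and path."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "",  # empty line ends the header section
        "",  # no body follows
    ]
    return "\r\n".join(lines).encode("ascii")

request = minimal_request("tools.ietf.org", "/html/rfc7230")
print(request)
```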

Caching headers

Sources