notes blog about

2018-01, Pan-Net



What for

Python libraries

HTTP message

HTTP message format (both request > and response <)

$ curl -v
* Rebuilt URL to:
*   Trying
* Connected to ( port 80 (#0)
> GET / HTTP/1.1
> Host:
> User-Agent: curl/7.47.0
> Accept: */*
< HTTP/1.1 302 Moved Temporarily
< Server: nginx/1.12.2
< Date: Thu, 11 Jan 2018 07:51:16 GMT
< Content-Type: text/html
< Content-Length: 161
< Connection: keep-alive
< Location:
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
* Connection #0 to host left intact

The client can’t issue another request over the same socket until the response is finished.

For requests, the body can include parameters (for POST or PUT requests) or the contents of a file to upload. For responses, the body is the payload of the resource being requested (e.g. HTML, image data, or query results). The message body is not necessarily human readable, since it can contain images or other binary data. The body can also be empty, as is usual for GET requests and for many error responses.
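A minimal sketch of the message layout described above: start line, header lines, a blank line, then the body. The function name `split_http_message` and the sample response are made up for illustration (the response is only modelled on the 302 in the curl trace); real parsers also handle folded headers, repeated headers, and partial reads.

```python
def split_http_message(raw: bytes):
    """Split a raw HTTP/1.x message into (start_line, headers, body)."""
    # Head and body are separated by the first blank line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical response, not captured from a real server.
raw_response = (
    b"HTTP/1.1 302 Moved Temporarily\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"302 Found"
)
start_line, headers, body = split_http_message(raw_response)
print(start_line)
print(headers["content-length"], len(body))
```

The body is kept as bytes on purpose, since it may be binary.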



Methods (verbs)




HTTP transfer encoding <-> content encoding

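Transfer encoding - how the body is framed on the wire (Content-Length or chunked encoding). Whether the body travels raw or compressed is negotiated with Accept-Encoding and, in practice, announced with Content-Encoding; a message carrying Transfer-Encoding must not also carry Content-Length.

GET / HTTP/1.1
Accept-Encoding: gzip

HTTP/1.1 200 OK
Content-Length: 3913
Content-Encoding: gzip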
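The two layers can be sketched separately: framing (chunked) and compression (gzip). This is only an illustration with made-up sample data; `decode_chunked` is a hypothetical helper that ignores trailers and assumes the whole body is already in memory.

```python
import gzip


def decode_chunked(data: bytes) -> bytes:
    """Reassemble a body that was framed with chunked encoding."""
    body = b""
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, terminated by CRLF.
        line_end = data.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension; ignore it.
        size = int(data[pos:line_end].split(b";")[0], 16)
        if size == 0:
            break  # a zero-sized chunk marks the end of the body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
    return body


# Framing: two chunks followed by the terminating zero-sized chunk.
wire = b"5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n"
print(decode_chunked(wire))

# Compression: what a client does with a gzip-compressed body.
compressed = gzip.compress(b"<h1>302 Found</h1>")
print(gzip.decompress(compressed))
```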

Content type - what format will be selected to represent a given resource

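Content encoding (character set) - if the content type ^ is text, what encoding will be used to turn text code points into bytes. It travels as the charset parameter of Content-Type, not in the Content-Encoding header (which names compression such as gzip).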

Content-Type: text/html; charset=utf-8
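One way to pull the media type and charset out of such a header with the standard library is `email.message.Message`, which understands the same parameter syntax. A small sketch; `parse_content_type` is a hypothetical helper name and the sample text is invented.

```python
from email.message import Message


def parse_content_type(value: str):
    """Return (media_type, charset) for a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_type(), msg.get_content_charset()


media_type, charset = parse_content_type("text/html; charset=utf-8")
print(media_type, charset)

# The charset decides how code points become bytes on the wire.
text = "na\u00efve caf\u00e9"
payload = text.encode(charset)
print(len(text), len(payload))
```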

Authentication and cookies

Basic Auth (HTTP-mediated authentication)
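For Basic Auth the client sends "username:password", base64-encoded, in an Authorization header. A sketch of how that header value is built; `basic_auth_header` is a hypothetical helper, and the credentials are the well-known example pair from the Basic Auth RFC. Base64 is an encoding, not encryption, so this only makes sense over TLS.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an 'Authorization' header for Basic Auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


print("Authorization:", basic_auth_header("Aladdin", "open sesame"))
```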



(initial request)
POST /login HTTP/1.1

(initial response)
HTTP/1.1 200 OK
Set-Cookie: session-id=d41d8cd98f00b204e9800998ecf8427e; Path=/

(all subsequent requests to the same host)
GET /login HTTP/1.1
Cookie: session-id=d41d8cd98f00b204e9800998ecf8427e
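The cookie round trip above can be sketched with the standard library's `http.cookies`: parse the Set-Cookie value from the response, then build the Cookie header for later requests. Only the name=value pair is sent back; attributes such as Path stay on the client.

```python
from http.cookies import SimpleCookie

# Value of the Set-Cookie header from the initial response.
set_cookie = "session-id=d41d8cd98f00b204e9800998ecf8427e; Path=/"

jar = SimpleCookie()
jar.load(set_cookie)
morsel = jar["session-id"]
print(morsel.value, morsel["path"])

# Header value to send with all subsequent requests to the same host.
cookie_header = "; ".join(f"{name}={m.value}" for name, m in jar.items())
print("Cookie:", cookie_header)
```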


Status codes

Returned by a server with each response.

1xx - Informational (Hold on)

2xx - Success (Here you go)

3xx - Redirects (Go the other way)

>>> r = requests.get('', allow_redirects=False)
>>> (r.status_code, r.url, r.headers['Location'])
(301, '', '/redirect/1')

4xx - Client errors (You messed up)

5xx - Server errors (I messed up)
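The class of a status code is just its first digit, and the standard reason phrases ship with Python in `http.HTTPStatus`. A small sketch; `describe` and `STATUS_CLASSES` are illustrative names, not part of any library.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirect",
    4: "Client error",
    5: "Server error",
}


def describe(code: int) -> str:
    """Return a one-line description of an HTTP status code."""
    category = STATUS_CLASSES.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Codes that are not registered in HTTPStatus still have a class.
        phrase = "non-standard code"
    return f"{code} {phrase} ({category})"


for code in (200, 302, 404, 500):
    print(describe(code))
```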


Minimally correct request nowadays: a request line plus a Host header, which HTTP/1.1 requires (without it, expect a 400 or 404):

GET /html/rfc7230 HTTP/1.1
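As bytes on the wire, such a request is the request line, a Host header, and a terminating blank line. A sketch with a hypothetical helper `minimal_request`; the host `example.com` is a placeholder, not the host from the original capture.

```python
def minimal_request(host: str, path: str = "/") -> bytes:
    """Build the smallest well-formed HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",  # mandatory in HTTP/1.1
        "",               # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Placeholder host, chosen only for illustration.
request = minimal_request("example.com", "/html/rfc7230")
print(request.decode("ascii"))
```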

Caching headers