Analysis and Application of HTTP Header Parameters
This week at work I ran into several problems, all of them errors related to HTTP headers. They fall roughly into three categories: Referrer Policy, resumable transfers (breakpoint resume), and range requests. Here I have collected and organized some notes on these topics and how they are applied.
Referrer
Referer
The first thing to do is to understand what the Referer is.
Simply put, when you initiate an HTTP request, the `Referer` field in the request header indicates which page the request was initiated from. (The header name really is spelled `Referer`, a historical misspelling of "referrer".) For a detailed explanation, see Ruan Yifeng's HTTP Referer tutorial.
Referrer-Policy
`Referrer-Policy` controls the content of the `Referer` field in the request header. It is currently a candidate standard, although some browsers already support it.
Currently `Referrer-Policy` accepts only the following values, which the specification defines as the `ReferrerPolicy` enum:
Empty string
If set to an empty string, no policy is specified and the browser falls back to its default behavior. Historically that default was `no-referrer-when-downgrade`; current browsers default to `strict-origin-when-cross-origin`.
no-referrer
The `Referer` header is omitted from the request entirely.
no-referrer-when-downgrade
This was the browser default when this policy was first standardized (current browsers have since switched to `strict-origin-when-cross-origin`). When navigating from an HTTPS site to an HTTP site or requesting its resources (a security downgrade, HTTPS → HTTP), no `Referer` is sent. In all other cases (same security level, HTTPS → HTTPS or HTTP → HTTP), the full URL of the source page is sent in `Referer`.
same-origin
The browser sends `Referer` only to same-origin sites, and it sends the full URL. Two sites are same-origin when they share the same protocol, domain name, and port.
origin
The browser sends only the origin of the source page (protocol, domain name, and port) in the `Referer` field, not the full path.
strict-origin
A more secure variant of the `origin` policy: it behaves the same way, except that no `Referer` is sent on requests from HTTPS sites to HTTP sites (security downgrade).
origin-when-cross-origin
When sending a request to a same-origin site, the browser sends the full URL in `Referer`. When sending a request to a site that is not same-origin, it sends only the origin (protocol, domain name, port).
strict-origin-when-cross-origin
Similar to `origin-when-cross-origin`, except that no `Referer` is sent on requests from HTTPS sites to HTTP sites (security downgrade).
unsafe-url
The browser always sends the full URL in the `Referer` field, no matter which site the request goes to.
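The rules above can be condensed into a short Python sketch. The names `origin_of` and `referer_for` and the example URLs are hypothetical, and the logic is simplified compared with what browsers actually do (it only strips the fragment and assumes URLs without embedded credentials), so treat it as an illustration of the policy values rather than the browser algorithm.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url):
    """Return (scheme, host, port), the triple that defines an origin."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme))


def referer_for(policy, source, target):
    """Return the Referer value sent from `source` to `target`, or None if omitted.

    The empty-string policy defers to the browser default and is not modelled here.
    """
    src = urlsplit(source)
    full_url = src._replace(fragment="").geturl()   # fragments are never sent
    origin_only = f"{src.scheme}://{src.netloc}/"
    same_origin = origin_of(source) == origin_of(target)
    downgrade = src.scheme == "https" and urlsplit(target).scheme == "http"

    if policy == "no-referrer":
        return None
    if policy == "no-referrer-when-downgrade":
        return None if downgrade else full_url
    if policy == "same-origin":
        return full_url if same_origin else None
    if policy == "origin":
        return origin_only
    if policy == "strict-origin":
        return None if downgrade else origin_only
    if policy == "origin-when-cross-origin":
        return full_url if same_origin else origin_only
    if policy == "strict-origin-when-cross-origin":
        if same_origin:
            return full_url
        return None if downgrade else origin_only
    if policy == "unsafe-url":
        return full_url
    raise ValueError(f"unknown policy: {policy!r}")


page = "https://example.com/a/b?q=1#section"
print(referer_for("origin", page, "https://other.test/"))       # https://example.com/
print(referer_for("strict-origin", page, "http://other.test/")) # None
```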
Ways to set Referrer-Policy
The policy can be set in any of the following ways:
**1.** Set it via the `Referrer-Policy` HTTP header:

    Referrer-Policy: origin
**2.** Set it via a `<meta>` element whose `name` is `referrer`:

    <meta name="referrer" content="origin">
**3.** Set the `referrerpolicy` attribute on an [`<a>`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/a), `<area>`, `<img>`, [`<iframe>`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe), or `<link>` element:

    <a href="http://example.com" referrerpolicy="origin">
**4.** If you only need to suppress the `Referer` information, you can also set the `rel` link relation to `noreferrer` on an [`<a>`](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/a), `<area>`, or `<link>` element:

    <a href="http://example.com" rel="noreferrer">
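As an illustration of the first method, here is a minimal sketch of a server that attaches the header to every response. It is written as a WSGI application in Python; the function name `app`, the page body, and the port mentioned in the comment are assumptions made for this example.

```python
def app(environ, start_response):
    """Hypothetical WSGI app that sends Referrer-Policy with every response."""
    body = b'<a href="http://example.com">example</a>'
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        # Links on this page will send only the origin in Referer.
        ("Referrer-Policy", "origin"),
    ]
    start_response("200 OK", headers)
    return [body]


# To try it locally (port 8000 is an arbitrary choice):
#   from wsgiref.simple_server import make_server
#   make_server("127.0.0.1", 8000, app).serve_forever()
```

Setting the header on the server applies the policy to the whole page, while the `referrerpolicy` and `rel` attributes affect only the element that carries them.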
Range request
Range requests are mainly used when downloading or uploading larger files: they let you operate on just one segment of the file.
A common scenario is resuming an upload or download over a poor network, where after a disconnect you fetch only the part that is still missing. Suppose you are downloading software and 95% is already done when the network drops. Without range requests you are forced to start the download all over again. With range requests you only need to fetch the remaining 5%.
Another scenario is multi-threaded download. For a large file, several threads are started and each one downloads a different section. Once all sections are complete they are spliced together locally into the full file, which makes more efficient use of the available resources.
These are the two common scenarios. Next, let's look at the technical details of how HTTP supports range requests.
HTTP
Checking whether range requests are supported
HTTP itself is a stateless, "loose" protocol. After many iterations, range requests are supported only from HTTP/1.1 (RFC 2616) onward. So if either the client or the server speaks a version lower than HTTP/1.1, range requests should not be used.
HTTP/1.1 defines a response header, `Accept-Ranges`, that marks whether range requests are supported. The only range unit it defines is `bytes`; the value `none` means range requests are not supported.
For example, the response headers of an MP4 file include `Accept-Ranges: bytes`, which indicates that the resource supports range requests.
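A client can apply this check to the headers of any response before attempting a range request. The helper below is a minimal sketch; the name `supports_byte_ranges` is hypothetical and the headers are assumed to arrive as a plain dictionary.

```python
def supports_byte_ranges(headers):
    """Return True if the response headers advertise `Accept-Ranges: bytes`."""
    # Header names are case-insensitive, so normalize them before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get("accept-ranges", "")
    units = [unit.strip().lower() for unit in value.split(",")]
    return "bytes" in units


print(supports_byte_ranges({"Content-Type": "video/mp4", "Accept-Ranges": "bytes"}))  # True
print(supports_byte_ranges({"Accept-Ranges": "none"}))                                # False
```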
Using range requests
Once it has been determined that both sides support range requests, we can use them when requesting resources.
Every file is ultimately a sequence of bytes stored on disk or in memory, so the file can be divided by byte offsets. All that is needed from HTTP is a way to request the bytes from n to n + x of the file.
HTTP/1.1 defines a `Range` header to specify the range of the requested entity. Byte positions run from `0` to `Content-Length - 1`, and the start and end are separated by `-`.
For example, if the first 1000 bytes of a resource have already been downloaded and you want to continue with the rest, just add `Range: bytes=1000-` to the HTTP request headers.
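A resumable download can derive that offset from the size of the partially downloaded file. The sketch below only builds the request and does not send it; the function name `resume_request`, the URL, and the `.part` file name are assumptions for illustration.

```python
import os
import urllib.request


def resume_request(url, partial_path):
    """Build a request that asks only for the bytes missing from `partial_path`."""
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        # Bytes 0..offset-1 are already on disk, so continue at `offset`.
        request.add_header("Range", f"bytes={offset}-")
    return request


# Hypothetical usage:
#   request = resume_request("http://example.com/big.bin", "big.bin.part")
#   with urllib.request.urlopen(request) as response: ...
```

A server that honors the header answers with `206 Partial Content`; a server that ignores it answers with `200` and the whole file, so the client should check the status code before appending to the partial file.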
`Range` supports several different forms, which can be chosen according to your needs.
- `500-1000`: Specifies both the start and the end of the range. Generally used for multi-threaded downloads.
- `500-`: Specifies only the start, and the range extends to the end of the resource. This suits resumable transfers, online playback, and so on.
- `-500`: No start position. It requests only the last 500 bytes of the content entity.
- `100-300, 1000-3000`: Specifies multiple ranges. This form is rarely used, so it is enough to know it exists.
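The four forms in the list can be resolved against a known file size as in the following sketch. The function name `resolve_ranges` is hypothetical, only the `bytes` unit is handled, and ranges that cannot be satisfied are silently skipped, which is a simplification of what a real server does.

```python
def resolve_ranges(header_value, content_length):
    """Turn a Range header value into a list of inclusive (start, end) byte offsets."""
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("this sketch only handles the bytes unit")

    resolved = []
    for part in spec.split(","):
        first, dash, last = part.strip().partition("-")
        if not dash:
            raise ValueError(f"malformed range: {part!r}")
        if first == "":
            # Suffix form "-N": the last N bytes of the resource.
            start = max(content_length - int(last), 0)
            end = content_length - 1
        else:
            # "A-B" or open-ended "A-"; the end is clamped to the last byte.
            start = int(first)
            end = min(int(last), content_length - 1) if last else content_length - 1
        if start <= end:
            resolved.append((start, end))
    return resolved


print(resolve_ranges("bytes=500-1000", 5000))            # [(500, 1000)]
print(resolve_ranges("bytes=500-", 5000))                # [(500, 4999)]
print(resolve_ranges("bytes=-500", 5000))                # [(4500, 4999)]
print(resolve_ranges("bytes=100-300, 1000-3000", 5000))  # [(100, 300), (1000, 3000)]
```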
HTTP is a two-sided negotiation protocol. Since the request uses the `Range` header, the response must in turn use the `Content-Range` header to mark which part of the entity content it carries.
The format of `Content-Range` is straightforward. It first states the unit, bytes, then the range of the content entity being sent and the total length.

    Content-Range: bytes 100-999/1000

In this example, bytes 100 through 999 of the content entity are sent, and the total size of the resource file is 1000 bytes. The HTTP response status code in this case is `206 Partial Content`.
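On the server side, producing such a response from an in-memory resource might look like the sketch below. The function name `partial_response` is hypothetical. The `416 Range Not Satisfiable` branch is not covered in the text above; it is the status HTTP defines for a range that lies outside the resource.

```python
def partial_response(data, start, end):
    """Return (status, headers, body) for the inclusive byte range start..end."""
    total = len(data)
    if start > end or start >= total:
        # The range cannot be satisfied; report only the total size.
        return 416, {"Content-Range": f"bytes */{total}"}, b""

    end = min(end, total - 1)        # clamp to the last available byte
    body = data[start:end + 1]       # end is inclusive, slices are exclusive
    headers = {
        "Content-Range": f"bytes {start}-{end}/{total}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


status, headers, body = partial_response(bytes(1000), 100, 999)
print(status, headers["Content-Range"], len(body))  # 206 bytes 100-999/1000 900
```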
Resource changes
When downloading large resources with a download tool, we sometimes pause partway and resume later, only to find that the download starts again from the beginning.
This may look like a failure of the HTTP range request, but that is not necessarily the case. It is likely that the requested resource changed while the download was in progress.
If the source file changes during the download while its URL stays the same, its length may have changed (which is easy to detect). In the extreme case the length does not change at all, and if you simply continue downloading, the pieces you end up with probably cannot be spliced into the file you need.
So when downloading a resource from a server in segments, we must guard against the resource changing. As mentioned earlier when discussing HTTP caching, the HTTP protocol can use `ETag` or `Last-Modified` to identify whether the current resource has changed.
- ETag: A validation token (fingerprint) of the current file, used to uniquely identify this version of the file.
- Last-Modified: marks the time when the current file was last modified.
In an HTTP range request you can use these same two fields to tell whether the resource behind the segmented requests has been modified: put one of them in the `If-Range` request header. `If-Range` takes either the `ETag` or the `Last-Modified` value, copied exactly as the server sent it. If the value still matches, the server returns the requested range with `206 Partial Content`; otherwise it returns the whole current resource with `200 OK`.
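The server's decision can be sketched as follows. The function name `status_for_range_request` and the sample validator values are hypothetical. Comparing dates as exact strings is a simplification, and the rejection of weak ETags (those prefixed with `W/`) is behavior defined by HTTP, not something stated in the text above.

```python
def status_for_range_request(if_range, current_etag, current_last_modified):
    """Return 206 if the requested range may be served, else 200 (send whole file)."""
    if if_range is None:
        return 206                      # no precondition: serve the range
    if if_range.startswith("W/"):
        return 200                      # weak ETags are not valid for If-Range
    if if_range.startswith('"'):
        matches = if_range == current_etag
    else:
        # Anything else is treated as an HTTP date and must match exactly.
        matches = if_range == current_last_modified
    return 206 if matches else 200


etag = '"abc123"'
modified = "Wed, 21 Oct 2015 07:28:00 GMT"
print(status_for_range_request('"abc123"', etag, modified))   # 206
print(status_for_range_request('"old999"', etag, modified))   # 200
```

A client that receives `200` in answer to a resumed download should discard the partial file and start over with the body it was just sent.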
References:
https://juejin.im/post/5cd81b59518825686a06fd05
https://developer.mozilla.org/zh-CN/docs/Web/HTTP/Headers/Referrer-Policy