Blob and Multipart Upload

What is a Blob?

A blob (Binary Large Object) is a large object of binary data. In database management systems, a blob is stored as a single entity; it is usually a video, audio, or other multimedia file. **In JavaScript, an object of type Blob represents an immutable, file-like object of raw data.**

[Screenshot: the myBlob object logged to the console, showing its size and type properties]

As the screenshot shows, the myBlob object contains two properties: size and type. The `size` property represents the size of the data in bytes, and `type` is a MIME-type string. Blobs do not necessarily represent data in a JavaScript-native format. For example, the `File` interface is based on `Blob`, inheriting its functionality and extending it to support files on the user's system.
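The object in the screenshot can be reproduced with a minimal sketch like the following (the string content here is arbitrary, chosen only for illustration):

```javascript
// Create a Blob and inspect its two properties.
const myBlob = new Blob(["Hello, Blob!"], { type: "text/plain" });
console.log(myBlob.size); // 12 — byte length of the UTF-8 encoded content
console.log(myBlob.type); // "text/plain"
```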

Blob

A blob is constructed from blobParts (an array of data parts) and an optional string type, which is usually a MIME type.

MIME (Multipurpose Internet Mail Extensions) is a standard that describes the nature and format of a document. Common MIME types include text/plain, text/html, image/png, and application/json.

The relevant parameters are described as follows:

  • blobParts: It is an array of ArrayBuffer, ArrayBufferView, Blob, DOMString, etc. DOMStrings will be encoded as UTF-8.

  • options: An optional object with the following two properties:

    • type - The default value is "", which represents the MIME type of the content that will be put into the blob.
    • endings - The default value is "transparent". It specifies how strings containing the line terminator \n are written. It is one of two values: "native", meaning line terminators are converted to the newline convention of the host operating system's file system, or "transparent", meaning the terminators saved in the blob are kept unchanged.

Property

We already know that Blob objects contain two properties.

  • size (read-only): Represents the size, in bytes, of the data contained in the Blob object.
  • type (read-only): A string indicating the MIME type of the data contained in the Blob object. If the type is unknown, the value is an empty string.

Method

  • slice([start[, end[, contentType]]]): Returns a new Blob object containing the data in the specified range of the source blob.
  • stream(): Returns a ReadableStream that can be used to read the blob's contents.
  • text(): Returns a Promise that resolves with the entire contents of the blob as a UTF-8 USVString.
  • arrayBuffer(): Returns a Promise that resolves with the entire contents of the blob as binary data in an ArrayBuffer.

Here we need to note that **Blob objects are immutable**. We cannot change data in a blob directly, but we can slice a blob, create new blob objects from parts of it, and mix those into a new blob. This behavior is similar to JavaScript strings: we cannot change a character within a string, but we can create a new, corrected string.

Large File Multipart Upload (Vue)

Client-side part

Upload slice

First, implement the upload function. Uploading requires two things:

  • Slicing the file
  • Transferring the slices to the server side

The File object used here actually inherits from Blob, so it supports slice().

<template>
  <div>
    <input type="file" @change="handleFileChange" />
    <el-button @click="handleUpload">Upload</el-button>
  </div>
</template>

<script>
const SIZE = 10 * 1024 * 1024; // slice size

export default {
  data: () => ({
    container: {
      file: null
    },
    data: []
  }),
  methods: {
    request() {},
    handleFileChange() {},
    // generate file slices
    createFileChunk(file, size = SIZE) {
      const fileChunkList = [];
      let cur = 0;
      while (cur < file.size) {
        fileChunkList.push({ file: file.slice(cur, cur + size) });
        cur += size;
      }
      return fileChunkList;
    },
    // upload slices
    async uploadChunks() {
      const requestList = this.data
        .map(({ chunk, hash }) => {
          const formData = new FormData();
          formData.append("chunk", chunk);
          formData.append("hash", hash);
          formData.append("filename", this.container.file.name);
          return { formData };
        })
        .map(async ({ formData }) =>
          this.request({
            url: "http://localhost:3000",
            data: formData
          })
        );
      await Promise.all(requestList); // upload slices concurrently
    },
    async handleUpload() {
      if (!this.container.file) return;
      const fileChunkList = this.createFileChunk(this.container.file);
      this.data = fileChunkList.map(({ file }, index) => ({
        chunk: file,
        hash: `${this.container.file.name}-${index}` // file name + array index
      }));
      await this.uploadChunks();
    }
  }
};
</script>

When the upload button is clicked, call createFileChunk to slice the file. The number of slices is controlled by the file size; the slice size is set to 10 MB here, meaning a 100 MB file will be divided into 10 slices.

A while loop and the slice method inside createFileChunk put the slices into the fileChunkList array.

When generating file slices, each slice needs an identifier to serve as a hash. Here we temporarily use "file name + index", so the backend knows which part of the file the current slice is, which is used later when merging the slices.

Then call uploadChunks to upload all the file slices: put each file slice, its hash, and the file name into a FormData object, call the earlier request function (which returns a promise), and finally call Promise.all to upload all the slices concurrently.
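The request function is left empty in the component above. For completeness, here is one possible promise-wrapped XMLHttpRequest sketch of it; the option shape follows the calls above, but this is an assumption about the implementation, not the article's exact code:

```javascript
// Hedged sketch of the request helper: wraps XMLHttpRequest in a promise
// so Promise.all can await all slice uploads.
const request = ({ url, method = "post", data, headers = {} }) =>
  new Promise(resolve => {
    const xhr = new XMLHttpRequest();
    xhr.open(method, url);
    Object.keys(headers).forEach(key =>
      xhr.setRequestHeader(key, headers[key])
    );
    xhr.onload = e => {
      resolve({ data: e.target.response });
    };
    xhr.send(data);
  });
```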

Send Merge Request

The second way of merging slices mentioned in the overall idea is used here: the front end actively notifies the server side to merge. The front end therefore needs to send an additional request, and the server side merges the slices when it receives that request.

<template>
  <div>
    <input type="file" @change="handleFileChange" />
    <el-button @click="handleUpload">Upload</el-button>
  </div>
</template>

<script>
export default {
  data: () => ({
    container: {
      file: null
    },
    data: []
  }),
  methods: {
    request() {},
    handleFileChange() {},
    createFileChunk() {},
    // upload slices and filter out already-uploaded slices
    async uploadChunks() {
      const requestList = this.data
        .map(({ chunk, hash }) => {
          const formData = new FormData();
          formData.append("chunk", chunk);
          formData.append("hash", hash);
          formData.append("filename", this.container.file.name);
          return { formData };
        })
        .map(async ({ formData }) =>
          this.request({
            url: "http://localhost:3000",
            data: formData
          })
        );
      await Promise.all(requestList);
      // merge slices
      await this.mergeRequest();
    },
    async mergeRequest() {
      await this.request({
        url: "http://localhost:3000/merge",
        headers: {
          "content-type": "application/json"
        },
        data: JSON.stringify({
          filename: this.container.file.name
        })
      });
    },
    async handleUpload() {}
  }
};
</script>

Server-side part

Simply use the http module to build the server.

const http = require("http");
const server = http.createServer();

server.on("request", async (req, res) => {
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Headers", "*");
  if (req.method === "OPTIONS") {
    res.statusCode = 200;
    res.end();
    return;
  }
});

server.listen(3000, () => console.log("listening on port 3000"));

Accept slices

Use the multiparty package to process the FormData sent from the front end.

In the multiparty.parse callback, the files parameter holds the files in the FormData, and the fields parameter holds the non-file fields.

const http = require("http");
const path = require("path");
const fse = require("fs-extra");
const multiparty = require("multiparty");

const server = http.createServer();
const UPLOAD_DIR = path.resolve(__dirname, "..", "target"); // large file storage directory

server.on("request", async (req, res) => {
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Headers", "*");
  if (req.method === "OPTIONS") {
    res.statusCode = 200;
    res.end();
    return;
  }

  const multipart = new multiparty.Form();

  multipart.parse(req, async (err, fields, files) => {
    if (err) {
      return;
    }
    const [chunk] = files.chunk;
    const [hash] = fields.hash;
    const [filename] = fields.filename;
    const chunkDir = path.resolve(UPLOAD_DIR, filename);

    // create the slice directory if it does not exist
    if (!fse.existsSync(chunkDir)) {
      await fse.mkdirs(chunkDir);
    }

    // fse.move is an fs-extra method similar to fs.rename, but cross-platform
    // (fs-extra's rename method has permission issues on Windows)
    // https://github.com/meteor/meteor/issues/7852#issuecomment-255767835
    await fse.move(chunk.path, `${chunkDir}/${hash}`);
    res.end("received file chunk");
  });
});

server.listen(3000, () => console.log("listening on port 3000"));

Looking at the chunk object produced by multiparty: path is the path of the temporary file and size is its size. The multiparty docs mention that fs.rename can be used to move the temporary file, i.e. to move the file slice (since fs-extra's rename method has permission problems on Windows, it was replaced with fse.move here).

When accepting file slices, a folder to store the slices must be created first. Since the front end carries a unique hash with each slice, the hash is used as the file name when moving the slice from its temporary path into the slice folder.

Merge slices

After receiving the merge request sent by the front end, the server side merges all the slices under the folder.

const http = require("http");
const path = require("path");
const fse = require("fs-extra");

const server = http.createServer();
const UPLOAD_DIR = path.resolve(__dirname, "..", "target"); // large file storage directory

const resolvePost = req =>
  new Promise(resolve => {
    let chunk = "";
    req.on("data", data => {
      chunk += data;
    });
    req.on("end", () => {
      resolve(JSON.parse(chunk));
    });
  });

const pipeStream = (path, writeStream) =>
  new Promise(resolve => {
    const readStream = fse.createReadStream(path);
    readStream.on("end", () => {
      fse.unlinkSync(path);
      resolve();
    });
    readStream.pipe(writeStream);
  });

// merge slices
const mergeFileChunk = async (filePath, filename, size) => {
  const chunkDir = path.resolve(UPLOAD_DIR, filename);
  const chunkPaths = await fse.readdir(chunkDir);
  // sort by slice index,
  // otherwise the order obtained by reading the directory directly may be wrong
  chunkPaths.sort((a, b) => a.split("-")[1] - b.split("-")[1]);
  await Promise.all(
    chunkPaths.map((chunkPath, index) =>
      pipeStream(
        path.resolve(chunkDir, chunkPath),
        // create a writable stream targeting the slice's position in the file
        fse.createWriteStream(filePath, {
          start: index * size,
          end: (index + 1) * size
        })
      )
    )
  );
  fse.rmdirSync(chunkDir); // delete the slice directory after merging
};

server.on("request", async (req, res) => {
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Headers", "*");
  if (req.method === "OPTIONS") {
    res.statusCode = 200;
    res.end();
    return;
  }

  if (req.url === "/merge") {
    const data = await resolvePost(req);
    const { filename, size } = data;
    const filePath = path.resolve(UPLOAD_DIR, filename);
    await mergeFileChunk(filePath, filename, size);
    res.end(
      JSON.stringify({
        code: 0,
        message: "file merged success"
      })
    );
  }
});

server.listen(3000, () => console.log("listening on port 3000"));

Since the front end carries the file name when sending the merge request, the server side can find the slice folder created in the previous step from the file name.

Then use fse.createWriteStream to create a writable stream for the merged target file.

Then traverse the slice folder, create a readable stream for each slice with fse.createReadStream, and pipe them into the target file to merge it.

It is worth noting that each readable stream is piped into a specified position of the writable stream, controlled by the start/end options in the second parameter of createWriteStream. This makes it possible to merge multiple readable streams into the writable stream concurrently: even if the streams complete in a different order, each one is written to its correct position. For this to work, the front end also needs to provide an additional size parameter in the merge request:

async mergeRequest() {
  await this.request({
    url: "http://localhost:3000/merge",
    headers: {
      "content-type": "application/json"
    },
    data: JSON.stringify({
      size: SIZE,
      filename: this.container.file.name
    })
  });
},

Other usage scenarios

We can download data from the internet and store it in a Blob object using, for example, the following method:

const downloadBlob = (url, callback) => {
  const xhr = new XMLHttpRequest()
  xhr.open('GET', url)
  xhr.responseType = 'blob'
  xhr.onload = () => {
    callback(xhr.response)
  }
  xhr.send(null)
}

Of course, in addition to the XMLHttpRequest API, we can also use the fetch API to obtain binary data in a streaming manner. Here we take a look at how to use the fetch API to fetch an online image and display it locally. The specific implementation is as follows:

const myImage = document.querySelector('img');
const myRequest = new Request('flowers.jpg');

fetch(myRequest)
  .then(function(response) {
    return response.blob();
  })
  .then(function(myBlob) {
    let objectURL = URL.createObjectURL(myBlob);
    myImage.src = objectURL;
  });

When the fetch request succeeds, we call the blob() method of the response object to read a Blob object from the response, then use the createObjectURL() method to create an object URL and assign it to the src attribute of the img element to display the image.

Blob as URL

A Blob can easily be used as a URL for <a>, <img>, or other tags. Thanks to the type attribute, we can also upload/download Blob objects. Below we will give an example of a blob file download, but before looking at the concrete example, we need to briefly introduce Blob URLs.

**1. Blob URL/Object URL**

Blob URL/Object URL is a pseudo-protocol that allows Blob and File objects to be used as URL sources for images, links to downloadable binary data, and so on. In browsers, we create Blob URLs with the URL.createObjectURL method, which takes a Blob object and creates a unique URL for it, in the form blob:<origin>/<uuid>. An example looks like this:

blob:https://example.org/40a5fb5a-d56d-4a33-b4e2-0acf6a8e5f64

The browser internally stores a URL → Blob mapping for each URL generated through URL.createObjectURL. Such URLs are therefore short, yet the Blob can still be accessed through them. A generated URL is only valid while the current document is open. It allows a Blob to be referenced in <img>, <a>, and so on, but if you access a Blob URL that no longer exists, the browser returns a 404 error.

The Blob URL looks quite convenient, but it also has a side effect. Although only the URL → Blob mapping is stored, the Blob itself still resides in memory, and the browser cannot release it. The mapping is automatically cleared when the document is unloaded, so the Blob objects are released then.

However, if the application has a long lifespan, that won't happen anytime soon. So if we create a Blob URL, the Blob will stay in memory even after it is no longer needed.

To solve this problem, we can call the URL.revokeObjectURL(url) method to remove the reference from the internal mapping, allowing the Blob to be deleted (if there are no other references) and the memory to be freed. Next, let's look at a concrete example of a blob file download.

**2. Blob file download example**

index.html

<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8" />
    <title>Blob file download example</title>
  </head>

  <body>
    <button id="downloadBtn">File download</button>
    <script src="index.js"></script>
  </body>
</html>

index.js

const download = (fileName, blob) => {
  const link = document.createElement("a");
  link.href = URL.createObjectURL(blob);
  link.download = fileName;
  link.click();
  link.remove();
  URL.revokeObjectURL(link.href);
};

const downloadBtn = document.querySelector("#downloadBtn");
downloadBtn.addEventListener("click", (event) => {
  const fileName = "blob.txt";
  const myBlob = new Blob(["一文彻底掌握 Blob Web API"], { type: "text/plain" });
  download(fileName, myBlob);
});

In the example, we create a Blob object of type "text/plain" by calling the Blob constructor, and then download the file by dynamically creating an <a> tag.

More usage

For more usage, refer to 你不知道的Blob ("The Blob You Don't Know").

Reference link:

你不知道的Blob

大文件分片上传