Introduction to blockchain
Blockchain is a big hit right now, with a lot of news media coverage claiming that it will create the future.
After reading some papers, blogs and official docs, the author integrated the content of these materials with his own understanding and reorganized them.
The essence of blockchain
Essentially, blockchain is a special kind of distributed database. First of all, it can store information, and any information that needs to be saved can be read and written in the blockchain.
However, unlike with ordinary databases, anyone can set up a server, join the blockchain network and become a node. All nodes are equal, and there is no role similar to a database administrator. If someone wants to add auditing to the blockchain, this cannot be done, because the design goal is to prevent the emergence of a central management authority.
It is precisely because it cannot be managed that blockchain cannot be controlled. Otherwise, once the big companies and conglomerates control the management, they will control the entire platform, and other users will have to obey them.
When data is written to any node, all nodes synchronize to keep the blockchain consistent, which is why data reliability can still be guaranteed after the blockchain is decentralized.
Some concepts in blockchain
The basic principles of blockchain are not complicated to understand. First, let’s look at three basic concepts:
- Transaction: An operation on the ledger that changes its state, such as adding a transfer record;
- Block: records all transactions and state results that occur over a period of time, which is a consensus on the current state of the ledger;
- Chain: Concatenated by blocks in the order in which they occur, it is a log record of changes in the state of the entire ledger.
Block
A blockchain consists of blocks, which can be compared to database records. Every time the data is modified, a new block is created. Because the blockchain is designed to be incremental, any change to data held in existing blocks (create, update or delete) is recorded by adding a new block to the chain rather than by editing an old one.
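The append-only idea can be sketched in a few lines. This is only a toy model of the description above: the class `AppendOnlyLog` and its methods are hypothetical names, and each record stands in for a block.

```python
from typing import Any, Dict, List, Tuple

Record = Tuple[str, str, Any]   # (operation, key, value)


class AppendOnlyLog:
    """Toy ledger: every change, including a deletion, is a new record."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def put(self, key: str, value: Any) -> None:
        self._records.append(("put", key, value))

    def delete(self, key: str) -> None:
        # Nothing is erased; the deletion itself is appended to the log.
        self._records.append(("delete", key, None))

    def current_state(self) -> Dict[str, Any]:
        """Replay the whole log to obtain the current data."""
        state: Dict[str, Any] = {}
        for op, key, value in self._records:
            if op == "put":
                state[key] = value
            else:
                state.pop(key, None)
        return state

    def __len__(self) -> int:
        return len(self._records)


log = AppendOnlyLog()
log.put("balance:alice", 10)
log.put("balance:alice", 7)      # an "update" is just another record
log.delete("balance:alice")      # so is a "delete"
print(len(log), log.current_state())   # 3 {}
```

The history keeps growing even though the visible data ends up empty, which is the sense in which the design is incremental.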
Each block is divided into two parts:
- Block header: records the feature values of the current block
- Block body: records the actual data of the current block
The block header contains several feature values of the current block:
- The generation time of the current block
- hash of block body
- Hash of the previous block
- …
A hash is a fixed-length characteristic value (a digest) that a computer can calculate for any content. The hashes used in the blockchain are 256 bits long: no matter what the content is, the calculated value is always 256 bits. For practical purposes, different contents produce different hash values; collisions exist in theory, but finding one is computationally infeasible.
This ensures that:
- In practice, every block has a different hash value, so blocks can be identified by their hashes
- If the content of a block changes, its hash value changes as well
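Both properties are easy to check with a short sketch. It assumes SHA-256 from Python's standard `hashlib` as the 256-bit hash; the helper name `digest_bits` is made up for illustration.

```python
import hashlib


def digest_bits(content: bytes) -> int:
    """Return the length in bits of the SHA-256 digest of `content`."""
    return len(hashlib.sha256(content).digest()) * 8


# Inputs of very different sizes all map to a digest of the same length.
for sample in (b"", b"a", b"a" * 1_000_000):
    print(len(sample), digest_bits(sample))

# Changing a single character produces a completely different digest.
print(hashlib.sha256(b"block 1").hexdigest())
print(hashlib.sha256(b"block 2").hexdigest())
```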
The hash value of the block is calculated based on the block header, which means that all the values of the block header are concatenated into a string and the string is hashed.
Putting the above together: the block header contains many values, including the hash of the current block body and the hash of the previous block. Therefore, if the content of the current block body or of the previous block changes, the hash of the current block changes as well.
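A minimal sketch of this calculation follows. It keeps only the three header fields listed earlier and joins them with a `|` separator; both choices are simplifications for illustration (real block headers contain more fields and are serialized in binary), and the names `BlockHeader` and `sha256_hex` are hypothetical.

```python
import hashlib
from dataclasses import dataclass


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class BlockHeader:
    timestamp: int   # generation time of the current block
    body_hash: str   # hash of the block body
    prev_hash: str   # hash of the previous block

    def block_hash(self) -> str:
        # Concatenate all header values into one string and hash that string.
        joined = f"{self.timestamp}|{self.body_hash}|{self.prev_hash}"
        return sha256_hex(joined)


body = "alice pays bob 5"
header = BlockHeader(1_700_000_000, sha256_hex(body), prev_hash="0" * 64)
print(header.block_hash())

# A different body gives a different body_hash, hence a different block hash.
changed_body = BlockHeader(header.timestamp, sha256_hex("alice pays bob 500"), header.prev_hash)
# A different previous block gives a different prev_hash, with the same effect.
changed_prev = BlockHeader(header.timestamp, header.body_hash, prev_hash="f" * 64)
print(changed_body.block_hash() != header.block_hash())
print(changed_prev.block_hash() != header.block_hash())
```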
This has great significance for blockchain. If someone modifies a block, the hash of that block changes. For subsequent blocks to still connect to it (because each block contains the hash of the previous block), that person must modify all subsequent blocks in turn; otherwise the modified block no longer links into the chain and is rejected. For reasons explained later, calculating a valid hash is very time-consuming, so it is almost impossible to modify multiple blocks in a short period of time unless someone controls more than half of the computing power of the entire network (the so-called 51% attack).
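The knock-on effect of a modification can be seen in a small sketch. It is a toy chain under simplifying assumptions: each block is a plain dict, its hash covers only the previous hash and the body, and the helper names are made up.

```python
import hashlib

GENESIS_PREV = "0" * 64


def block_hash(prev_hash: str, body: str) -> str:
    return hashlib.sha256(f"{prev_hash}|{body}".encode()).hexdigest()


def build_chain(bodies):
    """Build a list of blocks, each storing the hash of its predecessor."""
    chain, prev = [], GENESIS_PREV
    for body in bodies:
        block = {"prev_hash": prev, "body": body, "hash": block_hash(prev, body)}
        chain.append(block)
        prev = block["hash"]
    return chain


def first_broken_link(chain):
    """Return the index of the first inconsistent block, or None if all links hold."""
    prev = GENESIS_PREV
    for i, block in enumerate(chain):
        recomputed = block_hash(block["prev_hash"], block["body"])
        if block["prev_hash"] != prev or block["hash"] != recomputed:
            return i
        prev = block["hash"]
    return None


chain = build_chain(["tx A", "tx B", "tx C"])
print(first_broken_link(chain))   # None: the chain is consistent

chain[0]["body"] = "tx A (forged)"
print(first_broken_link(chain))   # 0: the stored hash no longer matches the body

# Recomputing the forged block's hash only moves the break to the next block.
chain[0]["hash"] = block_hash(chain[0]["prev_hash"], chain[0]["body"])
print(first_broken_link(chain))   # 1: block 1 still points at the old hash
```

To hide the change, the forger would have to redo every later block in the same way, which is what the cost of hash calculation is meant to prevent.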
Node
A node is a computer in a blockchain network; it may be a mobile phone, a mining machine, a server, and so on. A node may be operated by an ordinary wallet user, a miner, or several people working together. Bitcoin, for example, is a public chain: when we run the Bitcoin program on a computer connected to the Internet, that computer becomes a node in the Bitcoin network. For public blockchains like Bitcoin, in theory anyone who downloads the complete blockchain and takes part in transactions and mining counts as a node.
Each node keeps a full or partial copy of the ledger, solves the Byzantine Generals Problem by means of computing power or stake-based voting, and uses trustless methods to ensure that the ledgers kept by all other nodes are consistent with its own.
Mining
Because nodes must stay synchronized, new blocks cannot be added too quickly. Imagine that you have just synchronized a block and are about to generate the next block on top of it, when another node generates a new block: you have to abandon your half-finished calculation and synchronize again. Because each block can be followed by only one block, you can only ever build on the latest block, so you have no choice but to synchronize as soon as you hear of a new one.
Therefore, Satoshi Nakamoto, the inventor of Bitcoin and its blockchain, deliberately made it difficult to add new blocks. In his design, the entire network generates one new block every 10 minutes on average, which is only six per hour.
This rate is not enforced by decree but by deliberately requiring a massive amount of calculation. Only after an extremely large number of calculations can a valid hash for the current block be found, so that the new block can be added to the blockchain. Because so much calculation is needed, blocks cannot be produced quickly.
Principles of Blockchain
If the blockchain system is viewed as a finite-state machine, each transaction is a state change, and each generated block is the participants’ consensus on the outcome of the state changes caused by its transactions.
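The state-machine view can be made concrete with a small sketch. The ledger state is assumed to be a table of account balances and a transaction a simple transfer; these modelling choices and all names are illustrative only.

```python
from typing import Dict, List, Tuple

State = Dict[str, int]
Transaction = Tuple[str, str, int]   # (sender, receiver, amount)


def apply_transaction(state: State, tx: Transaction) -> State:
    """One state transition: return the new ledger state after a transfer."""
    sender, receiver, amount = tx
    if amount <= 0 or state.get(sender, 0) < amount:
        raise ValueError(f"invalid transaction: {tx}")
    new_state = dict(state)          # leave the old state untouched
    new_state[sender] -= amount
    new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state


def apply_block(state: State, block: List[Transaction]) -> State:
    """A block is an agreed batch of transitions applied in order."""
    for tx in block:
        state = apply_transaction(state, tx)
    return state


genesis = {"alice": 10, "bob": 0}
block = [("alice", "bob", 4), ("bob", "carol", 1)]
print(apply_block(genesis, block))   # {'alice': 6, 'bob': 3, 'carol': 1}
```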
The goal of blockchain is to achieve a distributed ledger of data records that only allows additions and not deletions. The basic structure of the underlying ledger is a linear linked list. The linked list consists of a series of “blocks” (as shown in the figure below), and the subsequent blocks record the hash value of the leading block. The legitimacy of a block (and the transactions in the block) can be quickly verified by calculating the hash value. Nodes in the network can propose adding a new block, but the block must be confirmed by a consensus mechanism.
Understanding the working process of blockchain through Bitcoin
Take the Bitcoin network as an example to see how blockchain technology is used.
First, the user initiates a transaction through the Bitcoin client, and the message is broadcast to the Bitcoin network to await confirmation. Nodes in the network package the pending transactions they have received, add the hash of the previous block header and other information, and form a block structure. Each node then tries to find a nonce (a random string) to put into the block so that the hash of the block structure meets a certain condition (for example, being smaller than a certain value). This process of searching for a nonce is commonly known as “mining”, and it requires a certain amount of computing power.
Once a node finds a nonce that meets the condition, the block is “legal” in format and becomes a candidate block, which the node broadcasts to the network. Other nodes verify the candidate block when they receive it; if it is valid, they recognize it as a new legal block and add it to the local copy of the blockchain they maintain. When most nodes have accepted the block, the block has been accepted by the network and the transactions it contains are confirmed.
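The nonce search can be sketched as a simple loop. This is a toy version, not Bitcoin's actual block format: the block is reduced to a previous hash plus a transaction string, the condition is "hash below a target", and `difficulty_bits` is a made-up parameter that sets how small the target is.

```python
import hashlib
from itertools import count


def header_hash(prev_hash: str, tx_data: str, nonce: int) -> int:
    """Hash the block contents together with a nonce, as a 256-bit integer."""
    payload = f"{prev_hash}|{tx_data}|{nonce}".encode()
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


def mine(prev_hash: str, tx_data: str, difficulty_bits: int) -> int:
    """Try nonces 0, 1, 2, ... until the hash falls below the target."""
    target = 1 << (256 - difficulty_bits)   # smaller target = harder puzzle
    for nonce in count():
        if header_hash(prev_hash, tx_data, nonce) < target:
            return nonce


nonce = mine("0" * 64, "alice pays bob 5", difficulty_bits=16)
print("nonce found:", nonce)
```

With 16 difficulty bits the loop needs tens of thousands of attempts on average; each extra bit doubles the expected work.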
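How other nodes might check a candidate block can be sketched as follows. The `Node` class, the dict-based block layout, and the "hash starts with three zero hex digits" condition are all assumptions chosen to keep the example small.

```python
import hashlib


def pow_hash(prev_hash: str, tx_data: str, nonce: int) -> str:
    return hashlib.sha256(f"{prev_hash}|{tx_data}|{nonce}".encode()).hexdigest()


class Node:
    """A toy node holding its own copy of the chain."""

    def __init__(self, difficulty_prefix: str = "000"):
        self.difficulty_prefix = difficulty_prefix
        self.chain = []   # accepted blocks, oldest first

    def tip_hash(self) -> str:
        return self.chain[-1]["hash"] if self.chain else "0" * 64

    def accept(self, block: dict) -> bool:
        """Verify a candidate block and append it if it is valid."""
        expected = pow_hash(block["prev_hash"], block["tx_data"], block["nonce"])
        valid = (
            block["prev_hash"] == self.tip_hash()             # extends our latest block
            and block["hash"] == expected                      # hash matches the contents
            and expected.startswith(self.difficulty_prefix)    # proof of work holds
        )
        if valid:
            self.chain.append(block)
        return valid


def make_candidate(prev_hash: str, tx_data: str, prefix: str) -> dict:
    """Search for a nonce and return the resulting candidate block."""
    nonce = 0
    while not pow_hash(prev_hash, tx_data, nonce).startswith(prefix):
        nonce += 1
    return {"prev_hash": prev_hash, "tx_data": tx_data, "nonce": nonce,
            "hash": pow_hash(prev_hash, tx_data, nonce)}


nodes = [Node() for _ in range(3)]
candidate = make_candidate("0" * 64, "alice pays bob 5", "000")
print([node.accept(candidate) for node in nodes])   # every node accepts it
print(nodes[0].accept(candidate))                   # False: it no longer extends the tip
```

Finding the nonce takes many hash attempts, while verifying it takes one, which is why every node can afford to check every block.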
There are two key steps here. One is to complete the consensus on a batch of transactions (creating a legal block structure); the other is to add new blocks to the chain structure and be recognized by the network to ensure that they cannot be tampered with in the future. Of course, there will be many additional details in the implementation.
Bitcoin’s consensus mechanism based on computing power (searching for a nonce) is called Proof of Work (PoW). The name reflects the fact that there is no known shortcut for making the hash result meet the condition; the only way is to try nonce values one by one by brute force. The more attempts (the greater the workload), the greater the probability of finding a valid nonce.
By adjusting the condition on the hash result, the Bitcoin network keeps the production of a legal block at about one every 10 minutes on average. The node that finds the block receives the transaction fees of all transactions in the block plus a fixed reward issued by the protocol (12.5 bitcoins at the time of writing, halved about every four years).
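The link between workload and success can be put into numbers. The sketch below assumes that hash outputs behave like uniformly random values and that the condition is expressed as a number of required leading zero bits (`difficulty_bits`, an illustrative parameter).

```python
def success_probability(difficulty_bits: int, attempts: int) -> float:
    """Chance that at least one of `attempts` hashes meets the condition."""
    p_single = 2.0 ** -difficulty_bits          # chance for a single attempt
    return 1.0 - (1.0 - p_single) ** attempts


for attempts in (1_000, 65_536, 500_000):
    print(attempts, round(success_probability(16, attempts), 4))
```

Each attempt is independent, so earlier failures do not make the next attempt more likely to succeed; only the total number of attempts matters.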
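The halving schedule of the fixed reward can be sketched as below. The initial reward of 50 bitcoins and the interval of 210,000 blocks are Bitcoin protocol parameters that the text above does not state; the function name `block_subsidy` is chosen for illustration.

```python
HALVING_INTERVAL = 210_000      # blocks between halvings (roughly four years)
COIN = 100_000_000              # satoshis per bitcoin
INITIAL_SUBSIDY = 50 * COIN     # reward of the earliest blocks, in satoshis


def block_subsidy(height: int) -> int:
    """Fixed reward (in satoshis) for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0
    return INITIAL_SUBSIDY >> halvings   # integer halving


for height in (0, 210_000, 420_000, 630_000):
    print(height, block_subsidy(height) / COIN)
# 0 50.0
# 210000 25.0
# 420000 12.5
# 630000 6.25
```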
The author’s understanding
According to the author’s current understanding, so-called mining does not dig for Bitcoin itself but competes for the bookkeeping rights to the Bitcoin ledger. If the entire Bitcoin network is compared to a ledger, each block on the blockchain is a page in that ledger. Not every page is accepted, though: only pages that conform to the rules can be inserted into the ledger, and a mining machine’s job is to find such pages.
Under the current Bitcoin incentive mechanism, whenever you successfully create a new page, you may write into it a transfer of a certain amount of Bitcoin to your own account, together with all the valid transactions you have collected that are not yet recorded in the blockchain; all of these go into the block body. As long as the page is successfully inserted and accepted by most nodes as part of the main branch, these transaction records are valid.
However, the number of bitcoins is limited. Once all bitcoins have been issued, a miner’s only source of income will be the transaction fees of the records on its own page.
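Why the total is limited follows from the halving rule: the rewards form a shrinking series that eventually reaches zero. The sketch below sums that series, assuming an initial reward of 50 bitcoins and a halving every 210,000 blocks (protocol parameters not stated in the text above).

```python
HALVING_INTERVAL = 210_000   # blocks per reward era
COIN = 100_000_000           # satoshis per bitcoin


def total_supply() -> int:
    """Sum the rewards of all blocks, in satoshis, until the reward hits zero."""
    total, subsidy = 0, 50 * COIN
    while subsidy > 0:
        total += HALVING_INTERVAL * subsidy
        subsidy >>= 1        # integer halving, so the reward eventually becomes 0
    return total


print(total_supply() / COIN)   # just under 21 million
```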
Three scenarios of blockchain
After the introduction of smart contracts, the blockchain has gone beyond the simple data recording function, and actually has a bit of “intelligent computing” meaning; further, it can also add rights management and high-level programming language support to the blockchain to achieve a more powerful distributed ledger system that supports more commercial scenarios.
Scenario | Function | Smart contracts | Consensus | Permissions | Type | Performance | Programming language | Representative |
---|---|---|---|---|---|---|---|---|
Digital Currency | Accounting Function | No or Weak | PoW | No | Public Chain | Lower | Simple Script | Bitcoin Network |
Distributed Application Engine | Smart Contract | Turing Complete | PoW, PoS | None | Public Chain | Restricted | Specific Language | Ethereum Network |
Distributed ledger with permissions | Business processing | Multiple languages, Turing complete | Multiple mechanisms including CFT, BFT, pluggable | Support | Consortium chain | Scalable | High-level programming language | Hyperledger |
According to the different participants, it can be divided into public (Public or Permissionless) chain, consortium (Consortium or Permissioned) chain and private (Private) chain.
Public blockchains, as the name implies, can be used and maintained by anyone, and participants are mostly anonymous. Typical examples are Bitcoin and Ethereum, where the information is fully public.
If the permission mechanism is further introduced, two types of private chain and consortium chain can be implemented.
Private chain is managed and restricted by centralized managers, only a few internal people can use it, and the information is not public. It is generally believed that the difference with traditional centralized accounting systems is not obvious.
Consortium chains sit in between. Several organizations (such as supply chain organizations or banking consortia) work together to maintain a blockchain. Access to the blockchain is restricted by permissions, and the relevant information is protected. A typical example is the Hyperledger project. In terms of architecture, most existing blockchains have at least a layered structure consisting of a network layer, a consensus layer, a smart contract layer and an application layer. Consortium chain implementations also introduce additional permission management mechanisms.
Hyperledger
Hyperledger Fabric is an open source project for enterprise customers, hosted by the Linux Foundation and led largely by IBM. Unlike on public chains such as Bitcoin and Ethereum, nodes in a Hyperledger Fabric network must be authorized and authenticated before they can join. This avoids the resource overhead of PoW, greatly improves transaction processing efficiency, and meets the performance requirements of enterprise-level applications. At the same time, to cope with flexible and changing application scenarios, Hyperledger Fabric adopts a highly modular system design in which the permission authentication module (MSP), the consensus service module (ordering service), the endorsement module (endorsing peers), the block submission module (committing peers) and others are deployed separately, so that developers can replace modules to suit specific business scenarios and plug modules in and out as needed. Hyperledger Fabric is therefore a development framework for private and consortium chains, and the system does not need a token in order to operate.
Basic concepts
Channel:
It is a data isolation mechanism to ensure that transaction information is only visible to transaction participants, and each channel is an independent blockchain, which allows multiple users to share the same blockchain system without worrying about information leakage. Channels enable different user businesses on the upper layer to share the same blockchain system resources, mainly including network, computing, and storage resources. Essentially, channels serve the upper layer business through different blockchain ledgers, and these blockchains are uniformly deployed on peers, and transactions are sorted and packaged through ordering services. Channels are controlled by permission isolation. Members in different channels cannot access the transaction information of the other party, but can only access the transaction information of the channel to which they belong.
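The isolation idea can be illustrated with a toy model. This is not Fabric's API: the `Channel` class, its methods, and the organization names are hypothetical and only mirror the description above (one independent ledger per channel, visible to members only).

```python
class Channel:
    """Toy channel: an independent ledger that only members may use."""

    def __init__(self, name, members):
        self.name = name
        self.members = frozenset(members)
        self.ledger = []   # each channel keeps its own list of transactions

    def _check(self, member):
        if member not in self.members:
            raise PermissionError(f"{member} is not a member of channel {self.name}")

    def submit(self, member, tx):
        self._check(member)
        self.ledger.append(tx)

    def read(self, member):
        self._check(member)
        return list(self.ledger)


trade = Channel("trade", {"org1", "org2"})
audit = Channel("audit", {"org2", "org3"})

trade.submit("org1", "org1 sells 10 units to org2")
print(trade.read("org2"))   # members see the channel's transactions

try:
    trade.read("org3")      # org3 shares the system but not this channel
except PermissionError as err:
    print("denied:", err)
print(audit.read("org3"))   # [] because the audit ledger is separate
```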
Chaincode:
Also known as a smart contract. It encapsulates asset definitions and asset-processing logic into interfaces that change the state of the ledger when called by users.
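What "asset definition plus processing logic behind an interface" means can be shown with a toy example. It is written in plain Python and does not use Fabric's chaincode API (real chaincode is typically written in Go, Java or Node.js); the class `AssetContract` and its methods are made up for illustration.

```python
from typing import Dict


class AssetContract:
    """Toy contract: the ledger state changes only through these methods."""

    def __init__(self) -> None:
        self.state: Dict[str, dict] = {}   # asset id -> asset record

    def create_asset(self, asset_id: str, owner: str, value: int) -> None:
        if asset_id in self.state:
            raise ValueError(f"asset {asset_id} already exists")
        self.state[asset_id] = {"owner": owner, "value": value}

    def transfer_asset(self, asset_id: str, caller: str, new_owner: str) -> None:
        asset = self.state.get(asset_id)
        if asset is None:
            raise KeyError(asset_id)
        if asset["owner"] != caller:
            raise PermissionError(f"{caller} does not own {asset_id}")
        asset["owner"] = new_owner


contract = AssetContract()
contract.create_asset("car1", owner="alice", value=100)
contract.transfer_asset("car1", caller="alice", new_owner="bob")
print(contract.state["car1"])   # {'owner': 'bob', 'value': 100}
```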
Ledger:
Blockchain ledger, storing transaction information and smart contract code.
Network:
The P2P network between transaction processing nodes is used to maintain the consistency of the blockchain ledger.
Ordering service:
Uses consensus mechanisms such as Kafka and SBFT to sort and package all transaction information into blocks, which are sent to the committing peers and written into the blockchain.
World state:
Shows the current state of the asset data. Underneath, asset information in the blockchain is organized in a LevelDB or CouchDB database to provide an efficient data access interface.
Membership service:
Manages authentication information and provides authorization services for clients and peers.
Role
In Hyperledger, there are three types of roles:
Client:
Sends transaction requests from end users to the blockchain network.
Peers:
Responsible for maintaining the blockchain ledger. Peers are divided into endorsing peers and committing peers: an endorser endorses a transaction (verifies it and signs it), while a committer receives the packaged blocks and writes them into the blockchain. A peer node is a logical concept, so an endorser and a committer can be deployed on the same physical machine.
Ordering service:
Receives transaction information, sorts it and packages it into blocks, then delivers the blocks to the committing peers, which write them into the blockchain.
Transaction process
The blockchain ledger is maintained by the peer nodes, not by the ordering service cluster. Only peer nodes therefore hold the complete blockchain information, while the ordering service cluster is responsible only for sorting transactions and retains just part of the blockchain information during processing. A node in the Hyperledger Fabric system is a logical concept and not necessarily a physical device. In a production environment, however, peer nodes should not be deployed on the same machine as orderer nodes, while endorsing peers and committing peers can share a machine. This design mainly serves to decouple the system architecture, improve scalability, and improve security through host isolation.
The endorsing peer verifies the client’s signature and then executes the smart contract code to simulate the transaction. When the simulation is complete, it signs the transaction information and returns it to the client. After receiving the signed transaction information, the client sends it to the orderer node for sorting. The orderer node sorts the transaction information, packages it into blocks, and broadcasts the blocks to the committing peers, which write them into the blockchain. (For the specific transaction process, please refer to: https://www.chainnews.com/articles/074736012702.htm)
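The endorse, order, commit sequence can be walked through with a toy simulation. Nothing here is Fabric's real API: the classes, the integer "asset" state, and the hash-based `sign` stand-in for a digital signature are all assumptions, and real Fabric performs further validation at commit time that is omitted.

```python
import hashlib
from typing import Dict, List


def sign(peer: str, payload: str) -> str:
    """Stand-in for a digital signature (not cryptographically meaningful)."""
    return hashlib.sha256(f"{peer}:{payload}".encode()).hexdigest()


class EndorsingPeer:
    def __init__(self, name: str, state: Dict[str, int]):
        self.name, self.state = name, state

    def endorse(self, key: str, delta: int) -> dict:
        # Simulate the contract; the ledger state is not modified here.
        new_value = self.state.get(key, 0) + delta
        return {"key": key, "value": new_value, "endorser": self.name,
                "signature": sign(self.name, f"{key}={new_value}")}


class Orderer:
    def __init__(self, block_size: int):
        self.block_size, self.pending = block_size, []

    def submit(self, endorsed_tx: dict) -> List[List[dict]]:
        """Queue a transaction; return any blocks that are now full."""
        self.pending.append(endorsed_tx)
        blocks = []
        while len(self.pending) >= self.block_size:
            blocks.append(self.pending[:self.block_size])
            self.pending = self.pending[self.block_size:]
        return blocks


class CommittingPeer:
    def __init__(self, state: Dict[str, int]):
        self.state, self.blockchain = state, []

    def commit(self, block: List[dict]) -> None:
        for tx in block:
            expected = sign(tx["endorser"], f"{tx['key']}={tx['value']}")
            if tx["signature"] == expected:      # check the endorsement
                self.state[tx["key"]] = tx["value"]
        self.blockchain.append(block)            # only peers keep the chain


# One machine hosts both peer roles, so they share the same state.
state = {"asset1": 10, "asset2": 20}
endorser = EndorsingPeer("peer0", state)
orderer = Orderer(block_size=2)
committer = CommittingPeer(state)

for key, delta in (("asset1", 5), ("asset2", -3)):   # the client's requests
    endorsed = endorser.endorse(key, delta)          # 1. endorse (simulate and sign)
    for block in orderer.submit(endorsed):           # 2. order into blocks
        committer.commit(block)                      # 3. commit to the ledger

print(state, len(committer.blockchain))   # {'asset1': 15, 'asset2': 17} 1
```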
Reference article:
https://mp.weixin.qq.com/s/8W_oegxPCMr9zTtpN1h6dA
http://www.ruanyifeng.com/blog/2017/12/blockchain-tutorial.html
https://yeasy.gitbooks.io/blockchain_guide/content/02_overview/definition.html