Private data

Note

This topic assumes an understanding of the conceptual material in the `documentation on private data <private-data/private-data.html>`_.

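Private data collection definition

A collection definition contains one or more collections. Each collection has a policy definition listing the organizations in the collection, properties that control the dissemination of private data at endorsement time, and an optional setting that determines whether the data is purged.

Beginning with the Fabric chaincode lifecycle introduced in the Fabric v2.0 Alpha, the collection definition is part of the chaincode definition. The collection is approved by channel members and is then deployed when the chaincode definition is committed to the channel. The collection file must be the same for all channel members. If you are using the peer CLI to approve and commit the chaincode definition, use the --collections-config flag to specify the path to the collection definition file. If you are using the Fabric SDK for Node.js, visit How to install and start your chaincode. To use the previous lifecycle process to deploy a private data collection, use the --collections-config flag when instantiating your chaincode.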

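A collection definition is composed of the following properties:

  • name: Name of the collection.

  • policy: The private data collection distribution policy defines which organizations' peers are allowed to persist the collection data. It is expressed using the Signature policy syntax, with each member included in an OR signature policy list. To support read/write transactions, the private data distribution policy must define a broader set of organizations than the chaincode endorsement policy, because peers must have the private data in order to endorse proposed transactions. For example, in a channel with ten organizations, five of the organizations might be included in a private data collection distribution policy, while the endorsement policy might call for any three of the organizations to endorse.

  • requiredPeerCount: Minimum number of peers (across authorized organizations) to which each endorsing peer must successfully disseminate private data before the peer signs the endorsement and returns the proposal response to the client. Requiring dissemination as a condition of endorsement ensures that private data is available in the network even if the endorsing peer becomes unavailable. When requiredPeerCount is 0, no distribution is required, but some distribution may still occur if maxPeerCount is greater than 0. A requiredPeerCount of 0 is typically not recommended, because it could lead to loss of private data in the network if the endorsing peer becomes unavailable. Typically you would want to require at least some distribution of the private data at endorsement time to ensure redundancy of the private data on multiple peers in the network.

  • maxPeerCount: For data redundancy purposes, the maximum number of other peers (across authorized organizations) to which each endorsing peer will attempt to distribute the private data. If an endorsing peer becomes unavailable between endorsement time and commit time, collection member peers that had not yet received the private data at endorsement time will be able to pull it from the peers to which it was disseminated. If this value is set to 0, the private data is not disseminated at endorsement time, which forces all authorized peers to pull the private data from the endorsing peers at commit time.

  • blockToLive: Represents how long the data should live on the private database, in terms of blocks. The data lives on the private database for the specified number of blocks and is then purged, making it obsolete from the network so that it cannot be queried from chaincode and cannot be made available to requesting peers. To keep private data indefinitely, that is, to never purge private data, set the blockToLive property to 0.

  • memberOnlyRead: A value of true indicates that peers automatically enforce that only clients belonging to one of the collection member organizations are allowed read access to private data. If a client from a non-member organization attempts to execute a chaincode function that performs a read of private data, the chaincode invocation is terminated with an error. Use a value of false if you would like to encode more granular access control within individual chaincode functions.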

Here is a sample collection definition JSON file, containing an array of two collection definitions:

[
  {
    "name": "collectionMarbles",
    "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
    "requiredPeerCount": 0,
    "maxPeerCount": 3,
    "blockToLive": 1000000,
    "memberOnlyRead": true
  },
  {
    "name": "collectionMarblePrivateDetails",
    "policy": "OR('Org1MSP.member')",
    "requiredPeerCount": 0,
    "maxPeerCount": 3,
    "blockToLive": 3,
    "memberOnlyRead": true
  }
]

This example uses the organizations from the BYFN (Build Your First Network) sample network, Org1 and Org2. The policy in the collectionMarbles definition authorizes both organizations to access the private data. This is a typical configuration when the chaincode data needs to remain private from the ordering service nodes. However, the policy in the collectionMarblePrivateDetails definition restricts access to a subset of organizations in the channel (in this case Org1). In a real scenario, there would be many organizations in the channel, with two or more organizations in each collection sharing private data between them.
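A collection definition file of this shape can be loaded and sanity-checked before it is passed to the CLI. The following is a minimal sketch, not Fabric's own validation: the Collection struct, the parseCollections function, and the specific checks (unique names, maxPeerCount not below requiredPeerCount) are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Collection mirrors the properties of one collection definition.
type Collection struct {
	Name              string `json:"name"`
	Policy            string `json:"policy"`
	RequiredPeerCount int    `json:"requiredPeerCount"`
	MaxPeerCount      int    `json:"maxPeerCount"`
	BlockToLive       uint64 `json:"blockToLive"`
	MemberOnlyRead    bool   `json:"memberOnlyRead"`
}

// parseCollections decodes a collection definition file and applies a few
// illustrative sanity checks (assumed here, not taken from Fabric itself).
func parseCollections(data []byte) ([]Collection, error) {
	var cols []Collection
	if err := json.Unmarshal(data, &cols); err != nil {
		return nil, fmt.Errorf("decode collection definitions: %w", err)
	}
	seen := make(map[string]bool)
	for _, c := range cols {
		switch {
		case c.Name == "":
			return nil, errors.New("collection with empty name")
		case seen[c.Name]:
			return nil, fmt.Errorf("duplicate collection name %q", c.Name)
		case c.RequiredPeerCount < 0:
			return nil, fmt.Errorf("%s: requiredPeerCount must not be negative", c.Name)
		case c.MaxPeerCount < c.RequiredPeerCount:
			return nil, fmt.Errorf("%s: maxPeerCount (%d) is below requiredPeerCount (%d)",
				c.Name, c.MaxPeerCount, c.RequiredPeerCount)
		}
		seen[c.Name] = true
	}
	return cols, nil
}

// sample repeats the two-collection definition from the example file.
const sample = `[
  {"name": "collectionMarbles", "policy": "OR('Org1MSP.member', 'Org2MSP.member')",
   "requiredPeerCount": 0, "maxPeerCount": 3, "blockToLive": 1000000, "memberOnlyRead": true},
  {"name": "collectionMarblePrivateDetails", "policy": "OR('Org1MSP.member')",
   "requiredPeerCount": 0, "maxPeerCount": 3, "blockToLive": 3, "memberOnlyRead": true}
]`

func main() {
	cols, err := parseCollections([]byte(sample))
	if err != nil {
		fmt.Println("invalid collection definition:", err)
		return
	}
	for _, c := range cols {
		fmt.Printf("%s: policy=%s blockToLive=%d\n", c.Name, c.Policy, c.BlockToLive)
	}
}
```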

Private data dissemination

Since private data is not included in the transactions that get submitted to the ordering service, and therefore not included in the blocks that get distributed to all peers in a channel, the endorsing peer plays an important role in disseminating private data to other peers of authorized organizations. This ensures the availability of private data in the channel’s collection, even if endorsing peers become unavailable after their endorsement. To assist with this dissemination, the maxPeerCount and requiredPeerCount properties in the collection definition control the degree of dissemination at endorsement time.

If the endorsing peer cannot successfully disseminate the private data to at least requiredPeerCount peers, it will return an error back to the client. The endorsing peer will attempt to disseminate the private data to peers of different organizations, in an effort to ensure that each authorized organization has a copy of the private data. Since transactions are not committed at chaincode execution time, the endorsing peer and recipient peers store a copy of the private data in a local transient store alongside their blockchain until the transaction is committed.
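The interaction between the two properties at endorsement time can be illustrated with a small simulation. This is only a sketch under stated assumptions: the sendFunc hook and the disseminate function are hypothetical, the sketch keeps trying peers until maxPeerCount acknowledgements are collected, and it does not model the preference for spreading copies across different organizations.

```go
package main

import "fmt"

// sendFunc is a hypothetical transport hook: it reports whether the given
// peer acknowledged receipt of the private data.
type sendFunc func(peer string) bool

// disseminate pushes private data to eligible peers until maxPeerCount of
// them have acknowledged it, and fails unless at least requiredPeerCount
// acknowledgements were collected.
func disseminate(peers []string, requiredPeerCount, maxPeerCount int, send sendFunc) (int, error) {
	acked := 0
	for _, p := range peers {
		if acked >= maxPeerCount {
			break
		}
		if send(p) {
			acked++
		}
	}
	if acked < requiredPeerCount {
		return acked, fmt.Errorf("disseminated to %d peers, need at least %d", acked, requiredPeerCount)
	}
	return acked, nil
}

func main() {
	// peer0.org2 is treated as unreachable in this simulation.
	reachable := func(peer string) bool { return peer != "peer0.org2" }
	peers := []string{"peer0.org2", "peer1.org2", "peer1.org1"}

	n, err := disseminate(peers, 1, 3, reachable)
	fmt.Println("acknowledged:", n, "error:", err)
}
```

With requiredPeerCount set to 0 the function never returns an error, which mirrors the case where endorsement succeeds even though no other peer holds a copy.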

When authorized peers do not have a copy of the private data in their transient data store at commit time (either because they were not an endorsing peer or because they did not receive the private data via dissemination at endorsement time), they will attempt to pull the private data from another authorized peer, for a configurable amount of time based on the peer property peer.gossip.pvtData.pullRetryThreshold in the peer configuration core.yaml file.

Note

The peers being asked for private data will only return the private data if the requesting peer is a member of the collection as defined by the private data dissemination policy.

Considerations when using pullRetryThreshold:

  • If the requesting peer is able to retrieve the private data within the pullRetryThreshold, it will commit the transaction to its ledger (including the private data hash), and store the private data in its state database, logically separated from other channel state data.

  • If the requesting peer is not able to retrieve the private data within the pullRetryThreshold, it will commit the transaction to its blockchain (including the private data hash), without the private data.

  • If the peer was entitled to the private data but it is missing, then that peer will not be able to endorse future transactions that reference the missing private data. A chaincode query for a missing key will be detected (based on the presence of the key’s hash in the state database), and the chaincode will receive an error.

Therefore, it is important to set the requiredPeerCount and maxPeerCount properties large enough to ensure the availability of private data in your channel. For example, if each of the endorsing peers becomes unavailable before the transaction commits, the requiredPeerCount and maxPeerCount properties will have ensured the private data is available on other peers.
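The commit-time pull governed by pullRetryThreshold can be pictured as a retry loop with a deadline. The sketch below is an illustration only: fetchFunc, clock, simulatedClock, and pullPrivateData are hypothetical names, and the retry interval is an assumed parameter rather than a documented peer property. A simulated clock is used so the example runs without waiting.

```go
package main

import (
	"fmt"
	"time"
)

// fetchFunc is a hypothetical hook that asks another authorized peer for the
// private data and reports whether it was returned.
type fetchFunc func() ([]byte, bool)

// clock abstracts time so the retry loop can be exercised without waiting.
type clock struct {
	now   func() time.Time
	sleep func(time.Duration)
}

// simulatedClock returns a clock whose time only advances when sleep is called.
func simulatedClock() clock {
	t := time.Unix(0, 0)
	return clock{
		now:   func() time.Time { return t },
		sleep: func(d time.Duration) { t = t.Add(d) },
	}
}

// pullPrivateData retries fetch until it succeeds or threshold has elapsed.
// A false result means the block is committed without the private data.
func pullPrivateData(fetch fetchFunc, threshold, interval time.Duration, c clock) ([]byte, bool) {
	deadline := c.now().Add(threshold)
	for {
		if data, ok := fetch(); ok {
			return data, true
		}
		if !c.now().Before(deadline) {
			return nil, false
		}
		c.sleep(interval)
	}
}

func main() {
	attempts := 0
	// The data becomes available on the third attempt in this simulation.
	fetch := func() ([]byte, bool) {
		attempts++
		if attempts < 3 {
			return nil, false
		}
		return []byte("private value"), true
	}
	data, ok := pullPrivateData(fetch, 60*time.Second, time.Second, simulatedClock())
	fmt.Println("retrieved:", ok, "value:", string(data), "attempts:", attempts)
}
```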

Note

For collections to work, it is important to have cross-organizational gossip configured correctly. Refer to our documentation on Gossip data dissemination protocol, paying particular attention to the “anchor peers” and “external endpoint” configuration.

Referencing collections from chaincode

A set of shim APIs is available for setting and retrieving private data.

The same chaincode data operations can be applied to channel state data and private data, but in the case of private data, a collection name is specified along with the data in the chaincode APIs, for example PutPrivateData(collection,key,value) and GetPrivateData(collection,key).

A single chaincode can reference multiple collections.
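As a rough illustration of one chaincode writing to two collections, the sketch below declares a local privateDataStub interface that mirrors only the PutPrivateData and GetPrivateData methods; a real chaincode would receive the shim's stub instead. The memStub type and the saveMarble function are hypothetical stand-ins introduced so the example runs on its own, and the collection names are taken from the sample definition in this topic.

```go
package main

import (
	"errors"
	"fmt"
)

// privateDataStub mirrors the two shim methods discussed in this section.
// It is a local interface for illustration, not the shim's own type.
type privateDataStub interface {
	PutPrivateData(collection string, key string, value []byte) error
	GetPrivateData(collection string, key string) ([]byte, error)
}

// memStub is a hypothetical in-memory stand-in for the peer.
type memStub struct {
	data map[string]map[string][]byte // collection -> key -> value
}

func newMemStub() *memStub {
	return &memStub{data: map[string]map[string][]byte{}}
}

func (s *memStub) PutPrivateData(collection, key string, value []byte) error {
	if collection == "" || key == "" {
		return errors.New("collection and key must not be empty")
	}
	if s.data[collection] == nil {
		s.data[collection] = map[string][]byte{}
	}
	s.data[collection][key] = value
	return nil
}

func (s *memStub) GetPrivateData(collection, key string) ([]byte, error) {
	// In this stand-in, an absent key yields a nil value and no error.
	return s.data[collection][key], nil
}

// saveMarble stores the owner in the collection shared by both organizations
// and the price in the collection restricted to a single organization.
func saveMarble(stub privateDataStub, name, owner, price string) error {
	if err := stub.PutPrivateData("collectionMarbles", name, []byte(owner)); err != nil {
		return err
	}
	return stub.PutPrivateData("collectionMarblePrivateDetails", name, []byte(price))
}

func main() {
	stub := newMemStub()
	if err := saveMarble(stub, "marble1", "tom", "99"); err != nil {
		fmt.Println("save failed:", err)
		return
	}
	owner, _ := stub.GetPrivateData("collectionMarbles", "marble1")
	price, _ := stub.GetPrivateData("collectionMarblePrivateDetails", "marble1")
	fmt.Printf("owner=%s price=%s\n", owner, price)
}
```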

How to pass private data in a chaincode proposal

Since the chaincode proposal gets stored on the blockchain, it is also important not to include private data in the main part of the chaincode proposal. A special field in the chaincode proposal called the transient field can be used to pass private data from the client (or data that chaincode will use to generate private data), to chaincode invocation on the peer. The chaincode can retrieve the transient field by calling the GetTransient() API. This transient field gets excluded from the channel transaction.
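A chaincode function that takes its sensitive input from the transient field might look roughly like the sketch below. The transientStub interface is a local mirror of the GetTransient and PutPrivateData methods, fakeStub is a hypothetical test double, and the transient key "price" and the setPrice function are names chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// transientStub mirrors the shim methods this example relies on. It is a
// local interface for illustration, not the shim's own type.
type transientStub interface {
	GetTransient() (map[string][]byte, error)
	PutPrivateData(collection string, key string, value []byte) error
}

// fakeStub is a hypothetical stand-in that records what was written.
type fakeStub struct {
	transient map[string][]byte
	written   map[string][]byte
}

func (s *fakeStub) GetTransient() (map[string][]byte, error) {
	return s.transient, nil
}

func (s *fakeStub) PutPrivateData(collection, key string, value []byte) error {
	s.written[collection+"/"+key] = value
	return nil
}

// setPrice reads the price from the transient field rather than from the
// regular proposal arguments, then stores it in a private data collection.
func setPrice(stub transientStub, marble string) error {
	transient, err := stub.GetTransient()
	if err != nil {
		return fmt.Errorf("read transient field: %w", err)
	}
	price, ok := transient["price"] // "price" is an assumed key name
	if !ok || len(price) == 0 {
		return errors.New(`transient field "price" is missing`)
	}
	return stub.PutPrivateData("collectionMarblePrivateDetails", marble, price)
}

func main() {
	stub := &fakeStub{
		transient: map[string][]byte{"price": []byte("99")},
		written:   map[string][]byte{},
	}
	if err := setPrice(stub, "marble1"); err != nil {
		fmt.Println("setPrice failed:", err)
		return
	}
	fmt.Printf("stored: %s\n", stub.written["collectionMarblePrivateDetails/marble1"])
}
```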

Access control for private data

Until version 1.3, access control to private data based on collection membership was enforced for peers only. Access control based on the organization of the chaincode proposal submitter was required to be encoded in chaincode logic. Starting in v1.4, a collection configuration option memberOnlyRead can automatically enforce access control based on the organization of the chaincode proposal submitter. For more information about collection configuration definitions and how to set them, refer back to the Private data collection definition section of this topic.

Note

If you would like more granular access control, you can set memberOnlyRead to false. You can then apply your own access control logic in chaincode, for example by calling the GetCreator() chaincode API or using the client identity chaincode library.
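With memberOnlyRead set to false, the access decision moves into chaincode. The sketch below shows one possible shape for such a check, assuming the submitter's MSP ID has already been obtained (for example through the client identity chaincode library). The readRules table, the function names in it, and authorizeRead are all hypothetical.

```go
package main

import "fmt"

// readRules maps a chaincode function name to the MSP IDs allowed to call
// it. The function names and MSP IDs are illustrative assumptions.
var readRules = map[string][]string{
	"readMarble":               {"Org1MSP", "Org2MSP"},
	"readMarblePrivateDetails": {"Org1MSP"},
}

// authorizeRead returns nil when the client's organization may call the
// given function, and an error otherwise.
func authorizeRead(function, clientMSPID string) error {
	allowed, ok := readRules[function]
	if !ok {
		return fmt.Errorf("no access rule for function %q", function)
	}
	for _, id := range allowed {
		if id == clientMSPID {
			return nil
		}
	}
	return fmt.Errorf("client of %s may not call %s", clientMSPID, function)
}

func main() {
	for _, msp := range []string{"Org1MSP", "Org2MSP"} {
		err := authorizeRead("readMarblePrivateDetails", msp)
		fmt.Printf("%s -> allowed=%t\n", msp, err == nil)
	}
}
```

Keeping the rules in one table makes it easier to review who can read what, at the cost of having to keep the table in step with the collection policies.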

Querying Private Data

Private data collections can be queried just like normal channel data, using shim APIs:

  • GetPrivateDataByRange(collection, startKey, endKey string)

  • GetPrivateDataByPartialCompositeKey(collection, objectType string, keys []string)

And for the CouchDB state database, JSON content queries can be passed using the shim API:

  • GetPrivateDataQueryResult(collection, query string)

Limitations:

  • Clients that call chaincode that executes range or rich JSON queries should be aware that they may receive a subset of the result set if the peer they query is missing private data, as explained in the Private data dissemination section above. Clients can query multiple peers and compare the results to determine whether a peer may be missing some of the result set.

  • Chaincode that executes range or rich JSON queries and updates data in a single transaction is not supported, as the query results cannot be validated on the peers that don’t have access to the private data, or on peers that are missing the private data that they have access to. If a chaincode invocation both queries and updates private data, the proposal request will return an error. If your application can tolerate result set changes between chaincode execution and validation/commit time, then you could call one chaincode function to perform the query, and then call a second chaincode function to make the updates. Note that calls to GetPrivateData() to retrieve individual keys can be made in the same transaction as PutPrivateData() calls, since all peers can validate key reads based on the hashed key version.
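The first limitation suggests comparing results from several peers. One simple way to do that on the client side is to take the union of the keys returned and report what each peer lacks. This is a sketch only: missingKeys is a hypothetical helper, and the peer names and keys are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// missingKeys takes the keys each peer returned for the same query and
// reports, per peer, the keys that some other peer returned but it did not.
// Peers that returned the full union do not appear in the result.
func missingKeys(results map[string][]string) map[string][]string {
	union := map[string]bool{}
	for _, keys := range results {
		for _, k := range keys {
			union[k] = true
		}
	}
	missing := map[string][]string{}
	for peer, keys := range results {
		have := map[string]bool{}
		for _, k := range keys {
			have[k] = true
		}
		for k := range union {
			if !have[k] {
				missing[peer] = append(missing[peer], k)
			}
		}
		sort.Strings(missing[peer]) // stable order for reporting
	}
	return missing
}

func main() {
	results := map[string][]string{
		"peer0.org1": {"marble1", "marble2", "marble3"},
		"peer0.org2": {"marble1", "marble3"},
	}
	for peer, keys := range missingKeys(results) {
		fmt.Printf("%s may be missing private data for: %v\n", peer, keys)
	}
}
```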

Using Indexes with collections

Note

The Fabric chaincode lifecycle being introduced in the Fabric v2.0 Alpha does not support using CouchDB indexes with your chaincode. To use the previous lifecycle model to deploy CouchDB indexes with private data collections, visit the v1.4 version of the Private Data Architecture Guide.

The topic CouchDB as the State Database describes indexes that can be applied to the channel’s state database to enable JSON content queries, by packaging indexes in a META-INF/statedb/couchdb/indexes directory at chaincode installation time. Similarly, indexes can also be applied to private data collections, by packaging indexes in a META-INF/statedb/couchdb/collections/<collection_name>/indexes directory. An example index is available here.

Considerations when using private data

Private data purging

Private data can be periodically purged from peers. For more details, see the blockToLive collection definition property above.
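The effect of blockToLive can be expressed as a small calculation. The sketch below assumes that data committed in block N remains available for the next blockToLive blocks and becomes eligible for purging at block N + blockToLive + 1; that exact boundary is an assumption of this example, and expiringBlock and isPurged are hypothetical helper names.

```go
package main

import (
	"fmt"
	"math"
)

// expiringBlock returns the block height at which private data committed in
// committingBlock becomes eligible for purging. A blockToLive of 0 means the
// data never expires. The "+ 1" boundary is an assumption of this sketch.
func expiringBlock(committingBlock, blockToLive uint64) uint64 {
	if blockToLive == 0 {
		return math.MaxUint64
	}
	return committingBlock + blockToLive + 1
}

// isPurged reports whether the data would have been purged once the ledger
// has reached currentBlock.
func isPurged(committingBlock, blockToLive, currentBlock uint64) bool {
	return currentBlock >= expiringBlock(committingBlock, blockToLive)
}

func main() {
	// Data written in block 10 to a collection with blockToLive set to 3.
	for height := uint64(12); height <= 15; height++ {
		fmt.Printf("height %d: purged=%t\n", height, isPurged(10, 3, height))
	}
}
```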

Additionally, recall that prior to commit, peers store private data in a local transient data store. This data automatically gets purged when the transaction commits. But if a transaction was never submitted to the channel and therefore never committed, the private data would remain in each peer’s transient store. This data is purged from the transient store after a configurable number of blocks, set by the peer.gossip.pvtData.transientstoreMaxBlockRetention property in the peer core.yaml file.

Updating a collection definition

To update a collection definition or add a new collection, you can upgrade the chaincode to a new version and pass the new collection configuration in the chaincode upgrade transaction, for example using the --collections-config flag if using the CLI. If a collection configuration is specified during the chaincode upgrade, a definition for each of the existing collections must be included.

When upgrading a chaincode, you can add new private data collections, and update existing private data collections, for example to add new members to an existing collection or change one of the collection definition properties. Note that you cannot update the collection name or the blockToLive property, since a consistent blockToLive is required regardless of a peer’s block height.

Collection updates become effective when a peer commits the block that contains the chaincode upgrade transaction. Note that collections cannot be deleted, as there may be prior private data hashes on the channel’s blockchain that cannot be removed.
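The update rules in this section (every existing collection must be included, names and blockToLive cannot change, collections cannot be deleted) can be checked before an upgrade is submitted. The following is a minimal sketch with a hypothetical validateUpgrade function and a simplified Collection type. Because a renamed collection is indistinguishable from a deleted one plus a new one, both cases surface as a missing collection.

```go
package main

import "fmt"

// Collection holds only the properties that the update rules constrain.
type Collection struct {
	Name        string
	BlockToLive uint64
}

// validateUpgrade checks a proposed collection list against the current one:
// every existing collection must still be present under the same name and
// with the same blockToLive. New collections are allowed.
func validateUpgrade(current, proposed []Collection) error {
	byName := make(map[string]Collection, len(proposed))
	for _, c := range proposed {
		byName[c.Name] = c
	}
	for _, old := range current {
		updated, ok := byName[old.Name]
		if !ok {
			return fmt.Errorf("existing collection %q is missing from the new definition", old.Name)
		}
		if updated.BlockToLive != old.BlockToLive {
			return fmt.Errorf("collection %q: blockToLive changed from %d to %d",
				old.Name, old.BlockToLive, updated.BlockToLive)
		}
	}
	return nil
}

func main() {
	current := []Collection{
		{Name: "collectionMarbles", BlockToLive: 1000000},
		{Name: "collectionMarblePrivateDetails", BlockToLive: 3},
	}
	// Adds a new collection but omits an existing one, so validation fails.
	proposed := []Collection{
		{Name: "collectionMarbles", BlockToLive: 1000000},
		{Name: "collectionAudit", BlockToLive: 0},
	}
	fmt.Println("validation result:", validateUpgrade(current, proposed))
}
```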

Private data reconciliation

Starting in v1.4, peers of organizations that are added to an existing collection will automatically fetch private data that was committed to the collection before they joined the collection.

This private data “reconciliation” also applies to peers that were entitled to receive private data but did not yet receive it, for example because of a network failure. The peer keeps track of private data that was “missing” at the time of block commit.

Private data reconciliation occurs periodically based on the peer.gossip.pvtData.reconciliationEnabled and peer.gossip.pvtData.reconcileSleepInterval properties in core.yaml. The peer will periodically attempt to fetch the private data from other collection member peers that are expected to have it.
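One reconciliation pass can be thought of as walking the list of missing keys and retrying each fetch. The sketch below is a simplified model, not the peer's implementation: fetchFunc, reconcileOnce, and the in-memory store are hypothetical, and the loop in main stands in for the periodic wake-up that a peer would perform every reconcileSleepInterval when reconciliationEnabled is true.

```go
package main

import "fmt"

// fetchFunc is a hypothetical hook that asks other collection member peers
// for the private data behind a key and reports whether it was obtained.
type fetchFunc func(key string) ([]byte, bool)

// reconcileOnce tries to fetch every key recorded as missing, stores what it
// obtains, and returns the keys that are still missing afterwards.
func reconcileOnce(missing []string, fetch fetchFunc, store map[string][]byte) []string {
	var still []string
	for _, key := range missing {
		if value, ok := fetch(key); ok {
			store[key] = value
		} else {
			still = append(still, key)
		}
	}
	return still
}

func main() {
	store := map[string][]byte{}
	missing := []string{"marble1", "marble2"}
	pass := 0
	fetch := func(key string) ([]byte, bool) {
		// In this simulation marble2 only becomes available from the
		// second pass onward.
		if key == "marble2" && pass < 2 {
			return nil, false
		}
		return []byte("details of " + key), true
	}
	// A real peer would sleep between passes; the sleep is omitted here.
	for len(missing) > 0 && pass < 5 {
		pass++
		missing = reconcileOnce(missing, fetch, store)
		fmt.Printf("pass %d: %d keys still missing\n", pass, len(missing))
	}
}
```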

Note that this private data reconciliation feature only works on peers running v1.4 or later of Fabric.