Downloading data from Integrated Data Lake

This section describes how to download data from Integrated Data Lake.

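Prerequisites

Which method to use depends on your requirements. You can download data from Integrated Data Lake using either of the following methods:

  1. Generating signed URLs
  2. Cross account access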

Generating signed URLs

To use this method, follow these steps:

  1. Generate a signed URL to download an object.
    Endpoint:
POST /generateDownloadObjectUrls
Content-Type: application/json

Request example:

{
  "paths": [
    {
      "path": "myfolder/mysubfolder/myobject.objext"
    }
  ]
}
Response example:

{
    "objectUrls":[
        {
            "signedUrl":"https://datalake-integ-dide2-5234525690573.s3.eu-central-1.amazonaws.com/data/ten%3Ddide2/myfolder/mysubfolder/myobject.objext?X-Amz-Security-Token=Awervzdg23452xvbxd3434ddg&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Credentials=ASIATCES50453sdf&X-Amz-Signature=2e2342sfgsdfgsdgh",
            "path":"myfolder/mysubfolder/myobject.objext"
        }
    ]
}
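  2. Use the signed URLs to download one or more objects from the target directory. A signed URL is valid for 120 minutes. Once it has expired, you need to generate a new signed URL.
Endpoint: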

GET https://datalake-integ-dide2-5234525690573.s3.eu-central-1.amazonaws.com/data/ten%3Ddide2/myfolder/mysubfolder/myobject.objext?X-Amz-Security-Token=Awervzdg23452xvbxd3434ddg&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Credentials=ASIATCES50453sdf&X-Amz-Signature=2e2342sfgsdfgsdgh

Response example:

This is sample text in the file being uploaded.
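
The two steps above can be sketched in Python using only the standard library. This is a minimal illustration rather than an official client: `BASE_URL`, `TOKEN`, the bearer-token header, and all function names are assumptions made for the example. Only the route `/generateDownloadObjectUrls` and the JSON field names come from the samples above.

```python
import json
import shutil
import urllib.request
from urllib.parse import parse_qs, urlsplit

# Assumptions: placeholder gateway URL and token; substitute your own values.
BASE_URL = "https://gateway.example.com/api/datalake/v3"
TOKEN = "<bearer-token>"


def build_signed_url_request(paths):
    """Build the POST /generateDownloadObjectUrls request for the given object paths."""
    body = json.dumps({"paths": [{"path": p} for p in paths]}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + "/generateDownloadObjectUrls",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + TOKEN,  # assumed auth scheme
        },
    )


def extract_signed_urls(response_body):
    """Map each object path to its signed URL, based on the response sample."""
    return {o["path"]: o["signedUrl"] for o in response_body["objectUrls"]}


def lifetime_seconds(signed_url):
    """Read the validity period from the X-Amz-Expires query parameter."""
    query = parse_qs(urlsplit(signed_url).query)
    return int(query["X-Amz-Expires"][0])


def download(signed_url, destination):
    """Stream the object behind a signed URL to a local file with a plain GET."""
    with urllib.request.urlopen(signed_url) as resp, open(destination, "wb") as out:
        shutil.copyfileobj(resp, out)


def fetch_objects(paths):
    """Request signed URLs for the paths and download each object to the current directory."""
    with urllib.request.urlopen(build_signed_url_request(paths)) as resp:
        urls = extract_signed_urls(json.load(resp))
    for path, url in urls.items():
        download(url, path.rsplit("/", 1)[-1])
```

Calling `fetch_objects(["myfolder/mysubfolder/myobject.objext"])` would perform both steps in sequence. The GET request needs no additional authorization header, because the signature is carried in the URL's query string.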

Cross account access

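Use this method when you need continuous access to a directory. For example, you have an AWS account that hosts an application, and this application needs ongoing access to an IDL directory. Cross account access is useful in this case.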

To use this method, follow these steps:

  1. Create the cross account that should be granted access.
POST /crossAccounts
Content-Type: application/json

Request example:

{
  "name": "testCrossAccount",
  "accessorAccountId": "960568630345",
  "description": "Cross Account Access for Testing",
  "subtenantId": "204a896c-a23a-11e9-a2a3-2a2ae2dbcce4"
}

Response example:

{
  "id": "20234sd34a23a-11e9-a2a3-2a2sdfw34ce4",
  "name": "testCrossAccount",
  "accessorAccountId": "960568630345",
  "description": "Cross Account Access for Testing",
  "timestamp": "2019-09-06T21:23:32.000Z",
  "subtenantId": "204a896c-a23a-11e9-a2a3-2a2ae2dbcce4",
  "eTag": 1
}
  2. After the cross account has been created, create a cross account access to grant the required permission on the required prefix.

POST /crossAccounts/20234sd34a23a-11e9-a2a3-2a2sdfw34ce4/accesses
Content-Type: application/json

Request example:

{
  "description": "Access to read from mysubfolder",
  "path": "myfolder/mysubfolder",
  "permission": "READ"
}

Response example:

{
  "id": "781c8b90-c7b6-4b1c-993c-b51a00b35be2",
  "description": "Access to read from mysubfolder",
  "storageAccount": "dlbucketname",
  "storagePath": "data/ten=tenantname/myfolder/mysubfolder",
  "path": "myfolder/mysubfolder",
  "permission": "READ",
  "status": "ENABLED",
  "timestamp": "2019-11-04T19:19:25.866Z",
  "eTag": 1
}
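  3. Once access has been granted, you can download data from the desired prefix via the AWS CLI or an AWS SDK, using credentials that have the appropriate access.

Use the following command to download a file from the S3 bucket: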

$ aws s3 cp s3://tgsbucket/myobject.objext .

download: s3://tgsbucket/myobject.objext to ./myobject.objext
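
The three steps above can also be scripted. The following Python sketch is illustrative only: `BASE_URL`, `TOKEN`, the bearer-token header, and all function names are assumptions, and it further assumes that `storageAccount` in the access response is the S3 bucket name and `storagePath` is the key prefix. The download uses the boto3 call `download_file` and expects boto3 to be installed and configured with credentials of the accessor AWS account.

```python
import json
import urllib.request

# Assumptions: placeholder gateway URL and token; substitute your own values.
BASE_URL = "https://gateway.example.com/api/datalake/v3"
TOKEN = "<bearer-token>"


def post_json(route, payload):
    """POST a JSON payload to the Data Lake API and return the decoded response."""
    req = urllib.request.Request(
        BASE_URL + route,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + TOKEN,  # assumed auth scheme
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def grant_read_access(name, accessor_account_id, subtenant_id, path):
    """Steps 1 and 2: create the cross account, then grant READ on a prefix."""
    account = post_json("/crossAccounts", {
        "name": name,
        "accessorAccountId": accessor_account_id,
        "description": "Cross account access for " + name,
        "subtenantId": subtenant_id,
    })
    return post_json("/crossAccounts/" + account["id"] + "/accesses", {
        "description": "Access to read from " + path,
        "path": path,
        "permission": "READ",
    })


def s3_location(access, object_name):
    """Derive (bucket, key) from an access response (assumed field meaning)."""
    bucket = access["storageAccount"]
    key = access["storagePath"].rstrip("/") + "/" + object_name
    return bucket, key


def download_object(access, object_name, destination):
    """Step 3: download one object from the granted prefix with boto3."""
    import boto3  # third-party package, imported lazily so the helpers above work without it

    bucket, key = s3_location(access, object_name)
    boto3.client("s3").download_file(bucket, key, destination)
```

With the access response shown in step 2, `s3_location(access, "myobject.objext")` would yield the bucket `dlbucketname` and the key `data/ten=tenantname/myfolder/mysubfolder/myobject.objext`. Verify the actual bucket and key layout for your tenant before relying on this mapping.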


Except where otherwise noted, content on this site is licensed under the MindSphere Development License Agreement.


Last update: January 6, 2020