Highest Voted 'aws-data-pipeline' Questions

1

vote

1 answer

Processing parameters passed to SQL activity in AWS data pipeline

I am working with AWS Data Pipeline. I am passing several parameters from the pipeline definition to a SQL file as follows: s3://reporting/preprocess.sql,-d,RUN_DATE=#{@scheduledStartTime.format('YYYYMMdd')}" My SQL file looks like…

asked Jan 22 '19 at 10:40

Joy

3,151
10
37
79

1

vote

0 answers

How to run multiple steps in aws data pipeline using aws console

I have a use case of scheduling my Spark jobs on EMR. Each time, we will spin up a new cluster and run the Spark job. I went through the documentation provided by AWS, but it is not extensive enough to give a clear picture of how to do it. If any…

amazon-emr aws-data-pipeline

asked Oct 31 '18 at 06:38

Raghav salotra

716
1
8
23

1

vote

1 answer

Unresolved resource dependencies [DefaultSchedule] in the Resources block of the template

I am working with a CloudFormation script to create an AWS Data Pipeline. I have created the script according to the documentation, but I am facing one error, i.e. Template validation error: Template format error: Unresolved resource dependencies…

amazon-web-services amazon-cloudformation aws-data-pipeline

asked Aug 08 '18 at 17:34

Achal

87
10

0

votes

2 answers

Is it possible to update and insert data in an AWS Glue database using Glue

I am using AWS PySpark and have gigabytes of data every day, which keeps getting updated. I want to find the id of the data in an existing table in the Glue database, update the row if the id already exists, and insert it if the id does not exist. Is it possible to…

amazon-web-services aws-glue aws-data-pipeline

asked May 08 '21 at 02:34

Paras Pandey

3
2

0

votes

1 answer

Avoid running the Install Task Runner step in an EMR cluster

I hope you can help me. I am trying to create an EMR cluster with Hadoop and Spark installed using Data Pipeline. The problem is that this EMR cluster is private, so it does not have internet access to download anything. In the pipeline I indicate bootstrap actions to…

amazon-web-services hadoop hive amazon-emr aws-data-pipeline

asked May 07 '21 at 07:52

vll1990

1
1

0

votes

0 answers

Completely deleting all resources related to AWS Glue and AWS Data Pipeline

I'm a student getting started with AWS (free tier). After realizing (I got billed) that I had exhausted my free tier for AWS Glue and Data Pipeline, I deleted all the resources that were billing me, even these two s3-buckets (mentioned in an image…

amazon-web-services amazon-s3 aws-glue aws-billing aws-data-pipeline

asked Apr 23 '21 at 08:25

Shreyas Chorge

45
4

0

votes

0 answers

Not able to read data from Hive using AWS Data Pipeline

Using AWS Data Pipeline, I used the driver HiveJDBC4.jar, gave the class name as com.amazon.hive.jdbc4.HS1Driver, and tried to connect to the Hive tables. The connection succeeds, but I am not able to retrieve the data. Below is the error: Connecting…

amazon-web-services jdbc hive aws-data-pipeline

asked Apr 08 '21 at 08:56

jyo

11
1

0

votes

0 answers

Crawler not creating table in data lake from Postgres partitioned table

My table is partitioned in Postgres. I have created a Glue crawler to create the table. I selected the option "Update all new and existing partitions with metadata from the table" in Configure the crawler's output. Since it's partitioned, the table is…

amazon-web-services aws-glue aws-data-pipeline aws-lake-formation

asked Apr 07 '21 at 18:48

rose1110

21
4

0

votes

0 answers

What does the default PipelineObject look like in AWS Data Pipeline

I'm trying to create an AWS data pipeline using the AWS PowerShell tools commands. I was able to create a pipeline using the New-DPPipeline command and am now trying to edit the pipeline using Write-DPPipelineDefinition. I'm trying to understand how PipelineObject…

amazon-web-services amazon-data-pipeline aws-data-pipeline

asked Apr 01 '21 at 04:41

Biswajit Maharana

133
1
10

0

votes

0 answers

Using AWS Data Pipeline to move data from AWS RDS to S3

I was trying to move data from RDS to S3 as a backup. I used DBeaver on my local PC to establish a connection with AWS RDS and uploaded a CSV file. I then tried to create a data pipeline to send data from RDS to S3. Initially, I got an error DBInstance…

amazon-s3 amazon-rds aws-data-pipeline

asked Mar 23 '21 at 12:19

kiran

1
1

0

votes

1 answer

Batch file processing in AWS using Data Pipeline

I have a requirement to read a CSV batch file that was uploaded to an S3 bucket, encrypt the data in some columns, and persist this data in a DynamoDB table. While persisting each row in the DynamoDB table, depending on the data in each row, I need to…

amazon-web-services batch-file batch-processing amazon-data-pipeline aws-data-pipeline

asked Mar 21 '21 at 11:34

Shanaka

1,442
2
18
38

0

votes

0 answers

Error in attribute value of parameters in AWS Data Pipeline put-pipeline-definition operation

I'm trying to upload an AWS Data Pipeline definition using the CLI. I've created a file with parameter objects that define the variables in the pipeline definition. { "parameters": [ { "id": "myShellCmd", "description": "Shell command to…

amazon-web-services aws-cli aws-data-pipeline

asked Feb 22 '21 at 16:37

Biswajit Maharana

133
1
10

0

votes

0 answers

Multithreading subprocess run for commands with massive output

I need to write a python script that would run 30-50 command line processes and store log output. To enforce things I am using ProcessPoolExecutor. For executing shell commands I am using subprocess.run. I am running all that code at data pipeline…

python multithreading shell subprocess aws-data-pipeline

asked Feb 22 '21 at 14:44

jk1

431
4
14
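
For the subprocess question above, the excerpt describes running 30-50 commands with subprocess.run and storing their log output. A minimal sketch of one way to do that follows. It is not taken from the question or its answers. It assumes that a thread pool is acceptable in place of the ProcessPoolExecutor the asker mentions (the heavy work happens in the child processes, so threads only wait on them), and it sends each command's output straight to its own log file so that large output is never held in memory. The names run_and_log, run_all, and the cmd_N.log file naming are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_and_log(command, log_path):
    """Run one command, writing stdout and stderr directly to a log file."""
    with open(log_path, "wb") as log_file:
        # Passing the file object avoids buffering the output in Python.
        completed = subprocess.run(
            command, stdout=log_file, stderr=subprocess.STDOUT
        )
    return completed.returncode


def run_all(commands, log_dir, max_workers=8):
    """Run all commands concurrently; return exit codes in input order."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: threads replace the ProcessPoolExecutor from the question.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_and_log, cmd, log_dir / f"cmd_{i}.log")
            for i, cmd in enumerate(commands)
        ]
        return [future.result() for future in futures]
```

Each command is given as an argument list (for example ["ls", "-l"]), and the exit codes come back in the same order as the commands, so failures can be matched to their log files.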

0

votes

0 answers

Is there an equivalent of the Azure Integration Runtime for AWS Data pipeline?

I have previously had successful implementations of data transfer from on-premise SQL Server instances to Azure SQL using the Integration Runtime component in conjunction with Azure Data Factory. I am not very familiar with AWS but from what I have…

amazon-web-services azure azure-data-factory aws-data-pipeline

asked Jan 12 '21 at 06:47

CelticGiz

1

0

votes

1 answer

AWS IAM Setup for EC2 Resource in AWS Data Pipeline

I am having an issue getting AWS Data Pipeline to run on an EC2 Instance via a Shell Command Activity. I have been following the guide found here step by step:…

amazon-web-services amazon-ec2 aws-data-pipeline

asked Dec 30 '20 at 17:18

WolVes

965
1
13
31
