Questions tagged [amazon-kinesis]

Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale. Amazon Kinesis can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources, allowing you to easily write applications that process information in real time from sources such as website clickstreams, marketing and financial information, manufacturing instrumentation, social media, operational logs, and metering data.

With Amazon Kinesis applications, you can build real-time dashboards, capture exceptions and generate alerts, drive recommendations, and make other real-time business or operational decisions. You can also easily send data to a variety of other services such as Amazon Simple Storage Service (Amazon S3), Amazon DynamoDB, or Amazon Redshift. In a few clicks and a couple of lines of code, you can start building applications that respond to changes in your data stream in seconds, at any scale, while paying only for the resources you use.

1434 questions
178 votes, 11 answers

Why should I use Amazon Kinesis and not SNS-SQS?

I have a use case where there will be a stream of data coming in that I cannot consume at the same pace, so I need a buffer. This can be solved using an SNS-SQS queue. I learned that Kinesis serves the same purpose, so what is the difference? Why…
Apoorv
  • 2,055
  • 2
  • 11
  • 18
49 votes, 13 answers

Application report for application_ (state: ACCEPTED) never ends for Spark Submit (with Spark 1.2.0 on YARN)

I am running a Kinesis plus Spark application (https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html). I run the command below on an EC2 instance: ./spark/bin/spark-submit --class org.apache.spark.examples.streaming.myclassname…
Sam
  • 1,271
  • 5
  • 21
  • 35
42 votes, 2 answers

What is partition key in AWS Kinesis all about?

I was reading about AWS Kinesis. In the following program, I write data into the stream named TestStream. I ran this piece of code 10 times, inserting 10 records into the stream. var params = { Data: 'More Sample data into the test stream ...', …
Suhail Gupta
  • 19,563
  • 57
  • 170
  • 298
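
As background for the partition key question above: the Kinesis documentation describes an MD5 hash mapping each partition key to a 128-bit integer, with the record going to the shard whose hash key range contains that value. The following is a minimal Python sketch of that idea, not the service's implementation. It assumes the digest is read as a big-endian unsigned integer and that the shards split the hash key space evenly; the helper names (`hash_key`, `even_shard_starts`, `shard_for`) are hypothetical.

```python
import hashlib
from bisect import bisect_right
from typing import List

# Upper bound of the 128-bit hash key space.
MAX_HASH_KEY = 2**128 - 1


def hash_key(partition_key: str) -> int:
    """MD5 of the UTF-8 partition key, read as a 128-bit unsigned integer.

    Big-endian interpretation is an assumption of this sketch.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def even_shard_starts(shard_count: int) -> List[int]:
    """Starting hash key of each shard, assuming an even split of the key space."""
    width = (MAX_HASH_KEY + 1) // shard_count
    return [i * width for i in range(shard_count)]


def shard_for(partition_key: str, starts: List[int]) -> int:
    """Index of the shard whose range contains the key's hash."""
    return bisect_right(starts, hash_key(partition_key)) - 1
```

Under these assumptions, records written repeatedly with the same partition key always land in the same shard, while different keys spread across the shards according to their hash values.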
33 votes, 3 answers

TRIM_HORIZON vs LATEST

I can't find in the formal documentation of AWS Kinesis any explicit reference between TRIM_HORIZON and the checkpoint, and also any reference between LATEST and the checkpoint. Can you confirm my theory: TRIM_HORIZON - In case the application-name…
Ida Amit
  • 1,077
  • 1
  • 8
  • 20
25 votes, 2 answers

Amazon Kinesis & AWS Lambda Retries

I'm very new to Amazon Kinesis so maybe this is just a problem in my understanding but in the AWS Lambda FAQ it says: The Amazon Kinesis and DynamoDB Streams records sent to your AWS Lambda function are strictly serialized, per shard. This means…
Stefano
  • 1,331
  • 1
  • 12
  • 12
24 votes, 9 answers

Reading the data written to s3 by Amazon Kinesis Firehose stream

I am writing records to a Kinesis Firehose stream, and they are eventually written to an S3 file by Amazon Kinesis Firehose. My record object looks like ItemPurchase { String personId, String itemId }. The data written to S3 looks…
learner_21
  • 463
  • 1
  • 4
  • 11
22 votes, 1 answer

Difference between Kinesis Stream and DynamoDB streams

They seem to be doing the same thing to me. Can anyone explain to me the difference?
Junji Zhi
  • 1,032
  • 1
  • 10
  • 17
21 votes, 3 answers

Amazon Kinesis and guaranteed ordering

Amazon claims their Kinesis streaming product guarantees record ordering. It provides ordering of records, as well as the ability to read and/or replay records in the same order (...) Kinesis is composed of Streams that are themselves composed of…
Dante
  • 3,557
  • 4
  • 32
  • 54
21 votes, 3 answers

AWS Lambda can't connect to RDS instance, but I can locally?

I am trying to connect to my RDS instance from a lambda. I wrote the lambda locally and tested locally, and everything worked peachy. I deploy to lambda, and suddenly it doesn't work. Below is the code I'm running, and if it helps, I'm invoking the…
20 votes, 3 answers

spark streaming checkpoint recovery is very very slow

Goal: Read from Kinesis and store data into S3 in Parquet format via Spark Streaming. Situation: The application runs fine initially, running batches of 1 hour, and the processing time is less than 30 minutes on average. For some reason, let's say the…
19 votes, 2 answers

Equivalent for Kafka / AWS Kinesis Stream on Google Cloud Platform

I'm building an app that is constantly appending to a buffer while many readers consume from this buffer independently (write-once-read-many / WORM). At first I thought of using Apache Kafka, but as I prefer an as-a-service option I started…
19 votes, 2 answers

How do you handle Amazon Kinesis Record duplicates?

According to the Amazon Kinesis Streams documentation, a record can be delivered multiple times. The only way to be sure to process every record just once is to temporarily store them in a database that supports integrity checks (e.g. DynamoDB,…
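
The idea in this excerpt, relying on a database integrity check to reject records that were already processed, can be sketched as follows. This is an illustration under stated assumptions rather than a recommended design: an in-memory SQLite table stands in for the database, each record is assumed to carry a unique `record_id` assigned by the producer, and `make_store` and `process_once` are hypothetical helper names.

```python
import sqlite3
from typing import Callable


def make_store() -> sqlite3.Connection:
    """Create an in-memory table whose primary key rejects repeated record IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (record_id TEXT PRIMARY KEY)")
    return conn


def process_once(conn: sqlite3.Connection, record_id: str,
                 handler: Callable[[], None]) -> bool:
    """Run handler only if record_id has not been seen before.

    Returns True if the handler ran, False if the record was a duplicate.
    If the handler raises, the insert is rolled back so the record can be retried.
    """
    with conn:  # commits on success, rolls back if an exception escapes
        try:
            conn.execute("INSERT INTO processed (record_id) VALUES (?)",
                         (record_id,))
        except sqlite3.IntegrityError:
            return False  # primary key violation: already processed
        handler()
    return True
```

For example, calling `process_once(conn, "order-42", handle)` twice runs `handle` only the first time; the second call returns `False` because the primary key constraint rejects the repeated ID.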
18 votes, 3 answers

Can I automatically append newlines to AWS Firehose records?

I am trying to configure a Kinesis Analytics application with the following settings: the input stream is a Kinesis Firehose that takes stringified JSON values; the SQL is a simple passthrough (it needs to be more complicated later but for testing,…
MrHen
  • 2,222
  • 1
  • 23
  • 37
18 votes, 3 answers

How to fanout an AWS kinesis stream?

I'd like to fan out/chain/replicate an input AWS Kinesis stream to N new Kinesis streams, so that each record written to the input Kinesis stream will appear in each of the N streams. Is there an AWS service or an open source solution? I prefer not to…
Gili Nachum
  • 3,982
  • 2
  • 27
  • 29
18 votes, 3 answers

Amazon KCL Checkpoints and Trim Horizon

How are checkpoints and trimming related in the AWS KCL library? The documentation page Handling Startup, Shutdown, and Throttling says: By default, the KCL begins reading records from the tip of the stream, which is the most recently added record.…
Edmondo1984
  • 17,841
  • 12
  • 55
  • 99