What’s new in AWS Kinesis?
Amazon Kinesis Data Firehose now also supports dynamic partitioning: it continuously groups in-transit data using dynamically or statically defined data keys and delivers the data to individual Amazon S3 prefixes by key. This reduces time to insight by minutes, lowers cost, and simplifies the overall architecture.
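As a rough illustration of this feature, the Python sketch below (boto3) creates a Firehose delivery stream with dynamic partitioning enabled, extracting a customer_id key from incoming JSON records and using it in the S3 prefix. The stream name, bucket, IAM role ARN, and the customer_id field are placeholder assumptions, not values from this article.

```python
import boto3

firehose = boto3.client("firehose")

# Sketch: delivery stream that partitions incoming JSON records by customer_id.
firehose.create_delivery_stream(
    DeliveryStreamName="clickstream-partitioned",                      # assumed name
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-s3-role",  # assumed role
        "BucketARN": "arn:aws:s3:::my-analytics-bucket",               # assumed bucket
        # Records are grouped in transit and delivered to a per-key S3 prefix.
        "Prefix": "events/customer_id=!{partitionKeyFromQuery:customer_id}/",
        "ErrorOutputPrefix": "errors/",
        "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
        "DynamicPartitioningConfiguration": {"Enabled": True},
        "ProcessingConfiguration": {
            "Enabled": True,
            "Processors": [
                {
                    "Type": "MetadataExtraction",
                    "Parameters": [
                        {"ParameterName": "MetadataExtractionQuery",
                         "ParameterValue": "{customer_id: .customer_id}"},
                        {"ParameterName": "JsonParsingEngine",
                         "ParameterValue": "JQ-1.6"},
                    ],
                }
            ],
        },
    },
)
```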
Working with Streaming Data
For working with streaming data using Apache Flink, AWS also offers Amazon Kinesis Data Analytics. Like Amazon Kinesis Data Streams and Kinesis Data Firehose, this service is a fully managed, serverless Apache Flink environment for stateful processing with sub-second latency. It integrates with several AWS services, supports custom connectors, and includes a notebook interface called Kinesis Data Analytics Studio (KDA Studio), a managed Apache Zeppelin notebook that lets you interact with streaming data.
Similar to Kinesis Data Analytics for Apache Flink, Amazon Managed Streaming for Apache Kafka (Amazon MSK) is a fully managed service for running highly available, event-driven Apache Kafka applications.
Amazon MSK operates, maintains, and scales Apache Kafka clusters, provides enterprises with security features, supports Kafka Connect, and has multiple built-in AWS integrations.
Architecture for Real-Time Reporting
Here we derive insights from input data coming from diverse sources and generate near real-time dashboards. In the architecture below, you stream near real-time data from source systems such as social media applications through Amazon MSK, AWS Lambda, and Kinesis Data Firehose into Amazon S3. You can then use AWS Glue for data processing, and load and transform data into Amazon Redshift using an AWS Glue development endpoint such as an Amazon SageMaker notebook. Once the data is in Amazon Redshift, you can create customer-centric business reports with Amazon QuickSight.
Real-time reporting

Architecture for Monitoring Streaming Data with Machine Learning
This architecture helps identify and act on deviations from forecasted data in near real time. In the architecture below, data is collected from multiple sources using Kinesis Data Streams and persisted in Amazon S3 by Kinesis Data Firehose. Initial data aggregation and preparation is done with Amazon Athena, and the results are stored back in Amazon S3. Amazon SageMaker is used to train a forecasting model and create behavioral predictions. As new data arrives, it is aggregated and prepared in real time by Kinesis Data Analytics, and the results are compared with the previously generated forecast. Amazon CloudWatch stores the forecast and actual values as metrics, and when the actual value deviates, a CloudWatch alarm triggers an incident in AWS Systems Manager Incident Manager.
Monitoring streaming data
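For the comparison step described above, a minimal sketch in Python (boto3) could publish the forecast and actual values as custom CloudWatch metrics and alarm on their deviation. The namespace, metric names, and threshold are assumptions; in a real setup the alarm action would point at an Incident Manager response plan created separately.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def publish_and_check(forecast: float, actual: float) -> None:
    """Publish forecast/actual as custom metrics plus their absolute deviation."""
    cloudwatch.put_metric_data(
        Namespace="StreamingForecast",   # assumed namespace
        MetricData=[
            {"MetricName": "ForecastValue", "Value": forecast},
            {"MetricName": "ActualValue", "Value": actual},
            {"MetricName": "ForecastDeviation", "Value": abs(actual - forecast)},
        ],
    )

# One-time alarm setup: fire when the deviation metric exceeds a threshold.
cloudwatch.put_metric_alarm(
    AlarmName="forecast-deviation",              # assumed alarm name
    Namespace="StreamingForecast",
    MetricName="ForecastDeviation",
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=3,
    Threshold=100.0,                             # assumed business threshold
    ComparisonOperator="GreaterThanThreshold",
    # An Incident Manager response plan ARN (created separately) can be added
    # as an AlarmAction to open an incident when this alarm fires.
    AlarmActions=[],
)
```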

Conclusion
Here are the key considerations when working with AWS streaming services and streaming applications, and when you need to choose a particular service or build a solution:
Usage Patterns
Kinesis Data Streams is for collecting and storing data, Kinesis Data Firehose is primarily for loading and transforming data streams into AWS data stores and several SaaS endpoints, and Kinesis Data Analytics is essentially for analyzing streaming data.
Throughput
Kinesis Data Streams scales with shards and supports payloads of up to 1 MB. As mentioned earlier, you have a provisioned mode and an on-demand mode for scaling shard capacity. Kinesis Data Firehose automatically scales to match the throughput of your data. The maximum streaming throughput a single Kinesis Data Analytics for SQL application can process is approximately 100 Mbps.
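To make the two capacity modes concrete, here is a small boto3 sketch that creates one on-demand stream and one provisioned stream, then writes a record (each record's data blob must stay within the 1 MB limit). The stream names and shard count are placeholder assumptions.

```python
import json
import boto3

kinesis = boto3.client("kinesis")

# On-demand mode: shard capacity scales automatically.
kinesis.create_stream(
    StreamName="orders-on-demand",                       # assumed name
    StreamModeDetails={"StreamMode": "ON_DEMAND"},
)

# Provisioned mode: you choose (and later re-shard) the shard count yourself.
kinesis.create_stream(
    StreamName="orders-provisioned",                     # assumed name
    ShardCount=4,                                        # assumed capacity
    StreamModeDetails={"StreamMode": "PROVISIONED"},
)

# Wait until the on-demand stream is active before writing to it.
kinesis.get_waiter("stream_exists").wait(StreamName="orders-on-demand")

# Each record payload can be up to 1 MB.
kinesis.put_record(
    StreamName="orders-on-demand",
    Data=json.dumps({"order_id": "123", "amount": 42.5}).encode(),
    PartitionKey="123",
)
```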
Latency
Kinesis Data Streams delivers data from producers to consumers in less than 70 milliseconds.
Ease of use and cost
All the streaming services on AWS are managed and serverless, including Amazon MSK Serverless, which makes them easy to use by abstracting away the infrastructure management overhead. Of course, also consider the pricing model of each service for your unique use case.

Related Resources
- Building Modern Data Streaming Architecture on AWS

Cloud migration has become an essential process for businesses seeking to improve efficiency, reduce costs, and scale operations. For small and medium-sized businesses (SMBs), transitioning to the cloud offers the opportunity to move away from traditional IT infrastructures, providing access to flexible resources, enhanced security, and the ability to innovate more quickly.
One study shows the global cloud migration services market was valued at approximately $10.91 billion in 2023 and is projected to grow to $69.73 billion by 2032, at a CAGR of 23.9%. This growth reflects the increasing demand for cloud solutions across industries, making migration an imperative step for businesses looking to stay competitive.
However, migrating to the cloud isn't as simple as just shifting data—there are key steps to ensure a smooth transition. This guide will walk businesses through the entire process, from initial planning to execution, helping them avoid common pitfalls and achieve the best outcomes for their cloud migration.
What is cloud migration?
Cloud migration is the method of moving a company's data, business elements, and other applications from on-premises infrastructure to cloud-based systems. This transition allows businesses to access scalable resources, reduce operational costs, and improve flexibility by using the cloud’s storage, computing, and network capabilities.
Cloud migration can involve moving entirely to the cloud or using a hybrid model, where some data and applications remain on-site while others are hosted in the cloud. The process typically includes planning, data transfer, testing, and ensuring everything works smoothly in the new cloud environment. It is a crucial step for businesses looking to modernize their IT infrastructure.
What are the benefits of cloud migration?
Cloud migration allows SMBs to improve efficiency and reduce costs by moving away from traditional IT infrastructure.
- Lower IT costs: Traditional IT infrastructure can be expensive to maintain, with costs for hardware, software, and support adding up quickly. Cloud migration helps businesses cut these costs by eliminating the need for expensive on-site equipment and offering a pay-as-you-go model. This makes it easier for businesses to manage budgets and save money.
- Flexibility to scale: Many small businesses face challenges when their needs grow, leading to expensive IT upgrades. The cloud offers the flexibility to easily scale resources up or down so companies can adjust to fluctuating requirements without the financial burden of over-investing in infrastructure.
- Enhanced security without extra effort: Data breaches and security concerns can be a major headache for small businesses that may not have the resources to manage complex security systems. Cloud providers offer top-tier security features, like encryption and regular audits, giving businesses peace of mind while saving them time and effort on security management.
- Remote access and collaboration: With more teams working remotely, staying connected can be a challenge. Cloud migration allows employees to access files and collaborate from anywhere, making it easier to work across locations and teams without relying on outdated, on-premises systems.
- Reliable backup and disaster recovery: Losing important business data can be devastating, especially for smaller companies that can't afford lengthy downtime. Cloud migration solutions may include disaster recovery features, which help automatically back up data, reducing the risk of data loss and allowing for quicker recovery in case of unforeseen issues.
- Automatic updates, less maintenance: Small businesses often struggle to keep their systems up to date, leading to security vulnerabilities or performance issues. Cloud migration ensures that the provider handles software updates and maintenance automatically, so businesses can focus on what they do best instead of worrying about IT.
7 R's cloud migration strategies for SMBs to consider

The concept of the 7 R’s of cloud migration emerged as organizations began facing the complex challenge of moving diverse applications and workloads to the cloud. As early adopters of cloud technology quickly discovered, there was no one-size-fits-all approach to migration. Each system had different technical requirements, business priorities, and levels of cloud readiness. To address this, cloud providers and consulting firms began categorizing migration strategies into a structured framework.
Each "R" represents a strategy for efficiently migrating companies' infrastructure to the cloud. Here’s a breakdown of each strategy:
- Rehost (lift and shift): This involves moving applications to the cloud with minimal changes, offering a fast migration but not utilizing cloud-native features like auto-scaling or cost optimization.
This is the simplest and quickest cloud migration strategy. It entails transferring applications and data to the cloud with few adjustments, essentially “lifting” them from on-premises servers and “shifting” them to the cloud. While this method requires little modification, it may not take full advantage of cloud-native features like scalability and cost savings.
When to use: Ideal for businesses looking for a fast migration, without altering existing applications significantly.
- Replatform (lift, tinker, and shift): Replatforming involves making minor adjustments to applications before migrating them to the cloud. This could mean moving to a different database service or tweaking configurations for cloud compatibility. Replatforming ensures applications run more efficiently in the cloud without a complete redesign.
When to use: Suitable for businesses wanting to gain some cloud benefits like improved performance or cost savings, without a complete overhaul of their infrastructure.
- Repurchase (drop and shop): This strategy involves replacing an existing application with a cloud-native solution, often through Software-as-a-Service (SaaS) offerings. For instance, a business might move from an on-premises CRM to a cloud-based CRM service. Repurchasing is often the best choice for outdated applications that are no longer cost-effective or efficient to maintain.
When to use: Best when an organization wants to adopt modern, scalable cloud services and replace legacy systems that are costly to maintain.
- Refactor (rearchitect): Refactoring, or rearchitecting, involves redesigning an application to leverage cloud-native features fully. This may include breaking down a monolithic application into microservices or rewriting parts of the codebase to improve scalability, performance, or cost efficiency. Refactoring enables businesses to unlock the full potential of the cloud.
When to use: This solution is ideal for businesses with long-term cloud strategies that are ready to make significant investments to improve application performance and scalability.
- Retire: The retire strategy is about eliminating applications or workloads that are no longer useful or relevant. This might involve decommissioning outdated applications or workloads that are redundant, no longer in use, or replaced by more efficient solutions in the cloud.
When to use: When certain applications no longer serve the business and moving them to the cloud would not provide any value.
- Retain (hybrid model): Retaining involves keeping some applications and workloads on-premises while others are migrated to the cloud. This is often part of a hybrid cloud strategy, where certain critical workloads remain on-site for security, compliance, or performance reasons while less critical systems move to the cloud.
When to use: This is useful for businesses with specific compliance or performance requirements that necessitate keeping certain workloads on-premises.
- Relocate (move and improve): Relocate involves moving applications and workloads to the cloud, but with some minor modifications to enhance cloud performance. This strategy is a middle ground between rehosting and more extensive restructuring, allowing businesses to improve certain elements of their infrastructure to better utilize cloud features without fully re-architecting applications.
When to use: Best for companies looking to move quickly to the cloud but with some minor adjustments to take advantage of cloud features like better resource allocation.
By understanding these 7 R’s and aligning them with business goals, companies can select the most appropriate strategy for each workload, ensuring a smooth, efficient, and cost-effective cloud migration.
Phases of the cloud migration process
Cloud migration is a strategic process that helps businesses shift their data, applications, and IT infrastructure from on-premise systems to cloud-based platforms. It involves several phases, each with its own set of activities and considerations. Here's a breakdown of the key phases involved in cloud migration:
1. Assess Phase
This is the initial phase of cloud migration where the organization evaluates its current IT environment, goals, and readiness for the cloud transition. The objective is to understand the landscape before making any migration decisions.
Key activities in the Assess Phase:
- Cloud Readiness Assessment: This includes evaluating the organization’s current IT infrastructure, security posture, and compatibility with cloud environments. A detailed assessment helps in understanding if the existing systems can move to the cloud or require re-architecting.
- Workload Assessment: Companies need to assess which workloads (applications, databases, services) are suitable for migration and how they should be prioritized. This process may also involve identifying dependencies between workloads that should be considered in the migration plan.
- Cost and Benefit Analysis: A detailed cost-benefit analysis should be carried out to estimate the financial implications of cloud migration, including direct and indirect costs, such as licensing, cloud service fees, and potential productivity improvements.
At the end of the Assess Phase, the organization should have a clear understanding of which systems to migrate, a roadmap, and the necessary cloud architecture to proceed with.
2. Mobilize Phase
The Mobilize Phase is where the groundwork for the migration is laid. In this phase, the organization prepares to move from assessment to action by building the necessary foundation for the cloud journey.
Key activities in the Mobilize Phase:
- Cloud Strategy and Governance: This step focuses on defining the cloud strategy, including governance structures, security policies, compliance requirements, and budget allocation. The organization should also identify the stakeholders and roles involved in the migration process.
- Resource Planning and Cloud Setup: The IT team prepares the infrastructure on the cloud platform, including setting up virtual machines, storage accounts, databases, and networking components. Key security and monitoring tools should also be put in place to manage and track the cloud environment effectively.
- Change Management Plan: It's crucial to manage how the transition will impact people and processes. Creating a change management plan ensures that employees are informed, trained, and supported throughout the migration process.
By the end of the Mobilize Phase, the organization should be fully prepared for the actual migration process, with infrastructure set up and a clear plan in place to manage the change.
3. Migrate and Modernize Phase
The Migrate and Modernize Phase is the heart of the migration process. This phase involves actual migration, along with the modernization of legacy applications and IT systems to take full advantage of the cloud.
Migration Stage 1: Initialize
In the Initialize stage, the organization starts by migrating the first batch of applications or workloads to the cloud. This stage involves:
- Defining Migration Strategy: Organizations decide on a migration approach—whether it’s rehosting (lift and shift), replatforming (moving to a new platform with some changes), or refactoring (re-architecting applications for the cloud).
- Pilot Testing: Before fully migrating all workloads, a pilot migration is performed. This allows teams to test and validate cloud configurations, assess the migration process, and make any necessary adjustments.
- Addressing Security and Compliance: Ensuring that security and compliance policies are in place for the migrated applications is key. During this phase, security tools and practices, like encryption and access control, are configured for cloud environments.
The Initialize stage essentially sets the foundation for a successful migration by moving a few workloads and gathering lessons learned to adjust the migration strategy.
Migration Stage 2: Implement
The Implement stage is the execution phase where the full-scale migration occurs. This stage involves:
- Full Migration Execution: Based on the lessons from the Initialize stage, the organization migrates all identified workloads, databases, and services to the cloud.
- Modernization: This is the phase where the organization takes the opportunity to modernize its legacy systems. This might involve refactoring applications to take advantage of cloud-native features, such as containerization or microservices architecture, improving performance, scalability, and cost-efficiency.
- Integration and Testing: Applications and data are fully integrated with the cloud environment. Testing ensures that all systems are working as expected, including testing for performance, security, and functionality.
- Performance Optimization: Once everything is in place, performance optimization becomes a priority. This may involve adjusting resources, tuning applications for the cloud, and setting up automation for scaling based on demand.
At the end of the Implement stage, the migration is considered complete, and the organization should be fully transitioned to the cloud with all systems functional and optimized for performance.
Common cloud migration challenges

While cloud migration offers numerous benefits, it also comes with its own set of challenges. Understanding these hurdles can help SMBs prepare and ensure a smoother transition.
- Data security and privacy concerns: Moving sensitive data to the cloud can raise concerns about its security and compliance with privacy regulations. Many businesses worry about unauthorized access or data breaches. Ensuring that the cloud provider offers strong security protocols and compliance certifications is crucial to addressing these fears.
- Complexity of migration: Migrating data, applications, and services to the cloud can be a tricky procedure, especially for businesses with legacy systems or highly customized infrastructure. The challenge lies in planning and executing the migration without causing significant disruptions to ongoing operations. It requires thorough testing, proper tool selection, and a well-defined migration strategy.
- Downtime and business continuity: Businesses fear downtime during the migration process, as it could impact productivity, customer experience, and revenue. Planning for minimal downtime with proper testing, backup solutions, and scheduling during off-peak hours is vital to mitigate this risk.
- Cost overruns: While cloud migration is often seen as a cost-saving move, without proper planning, businesses may experience unexpected costs. This could be due to hidden fees, overspending on resources, or underestimating the complexity of migrating certain workloads. It’s essential to budget carefully and select the right cloud services that align with the business’s needs.
- Lack of expertise: Many small businesses lack the in-house expertise to execute a cloud migration effectively. Without knowledgeable IT staff, businesses may struggle to manage the migration process, leading to delays, errors, or suboptimal cloud configurations. In such cases, seeking external help from experienced cloud consultants can alleviate these concerns.
- Integration with existing systems: One of the biggest challenges is ensuring that cloud-based systems integrate smoothly with existing on-premises infrastructure and other third-party tools. Poor integration can lead to inefficiencies and system incompatibilities, disrupting business operations.
If you have already migrated to the cloud, partners like Cloudtech help SMBs modernize their cloud environments for better performance, scalability, and cost-efficiency. Unlock the full potential of your existing cloud infrastructure with expert optimization and support from Cloudtech. Get in touch to future-proof your cloud strategy today.
Conclusion
In conclusion, cloud migration offers small and medium-sized businesses significant opportunities to improve efficiency, scalability, and cost-effectiveness. By following the right strategies and best practices, businesses can achieve a seamless transition to the cloud while addressing common challenges.
For businesses looking to optimize their cloud services, Cloudtech provides tailored solutions to streamline the process, from infrastructure optimization to application modernization. Use Cloudtech’s expertise to unlock the full potential of cloud technology and support your business growth.
Frequently Asked Questions (FAQs)
1. What is cloud migration, and why is it important?
A: Cloud migration is the process of moving digital assets, such as data, applications, and IT resources, from on-premises infrastructure to cloud environments. It is important because it enables businesses to improve scalability, reduce operational costs, and increase agility in responding to market demands.
2. What are the 7 R’s of cloud migration, and how do they help?
A: The 7 R’s include Rehost, Replatform, Refactor, Repurchase, Retire, Retain, and Relocate. They represent strategic approaches businesses can use when transitioning workloads to the cloud. This framework helps organizations evaluate each application individually and choose the most effective migration method based on technical complexity, cost, and business value.
3. How can a small business prepare for a successful cloud migration?
A: Small businesses should start by assessing their current IT environment, setting clear goals, and identifying which workloads to move first. It's also crucial to allocate a realistic budget, ensure data security measures are in place, and seek external support if internal expertise is limited.
4. What challenges do SMBs commonly face during cloud migration?
A: SMBs often face challenges such as limited technical expertise, data security concerns, cost overruns, and integration issues with legacy systems. Many struggle with creating a well-structured migration plan, which can lead to downtime and inefficiencies if not properly managed.
5. How long does a typical cloud migration take?
A: The duration of a cloud migration depends on the size and complexity of the infrastructure being moved. It can range from a few weeks for smaller, straightforward migrations to several months for large-scale or highly customized environments. Proper planning and execution are key to minimizing delays.

Small and mid-sized businesses (SMBs) in the healthcare sector are increasingly turning to cloud solutions to streamline operations, improve patient care, and reduce infrastructure costs. In fact, a recent study revealed that 70% of healthcare organizations have adopted cloud computing solutions, with another 20% planning to migrate within the next two years, indicating a 90% adoption rate by the end of 2025.
However, with the shift to digital platforms comes the critical responsibility of maintaining compliance with the Health Insurance Portability and Accountability Act (HIPAA). It involves selecting cloud providers that meet HIPAA requirements and implementing the right safeguards to protect sensitive patient data.
In this blog, we will look at how healthcare SMBs can stay HIPAA-compliant in the cloud, address their specific challenges, and explore how cloud solutions can help ensure both security and scalability for their systems.
Why HIPAA compliance is essential for cloud computing in healthcare
With the rise of cloud adoption, healthcare SMBs must ensure they meet HIPAA standards to protect data and avoid legal complications. Here are three key reasons why HIPAA compliance is so important in cloud computing for healthcare:
- Safeguarding electronic Protected Health Information (ePHI): HIPAA regulations require healthcare organizations to protect sensitive patient data, ensuring confidentiality and security. Cloud providers offering HIPAA-compliant services implement strong encryption methods and other security measures to prevent unauthorized access to ePHI.
- Mitigating risks of data breaches: Healthcare organizations are prime targets for cyberattacks, and data breaches can result in significant financial penalties and loss of trust. HIPAA-compliant cloud solutions provide advanced security features such as multi-factor authentication, secure data storage, and regular audits to mitigate these risks and prevent unauthorized access to patient data.
- Ensuring privacy and security of patient data: HIPAA ensures overall privacy and security beyond just ePHI protection. Cloud environments that comply with HIPAA standards implement safeguards that protect patient data both at rest and in transit, ensuring that healthcare organizations meet privacy requirements and provide patients with the peace of mind they deserve.
By maintaining HIPAA compliance in the cloud, healthcare organizations can also build trust with patients, safeguard valuable data, and streamline their operations.
Benefits of cloud computing for healthcare

Cloud computing is reshaping the healthcare landscape, providing significant advantages that enhance service delivery, operational efficiency, and patient care. Here are some key benefits healthcare organizations can experience by adopting cloud solutions:
- Scalability and cost-effectiveness: Cloud computing allows healthcare organizations to adjust their infrastructure as needed, reducing the need for expensive hardware investments and offering pay-as-you-go models, making it ideal for SMBs with fluctuating demands.
- Improved accessibility and efficiency: Cloud-based systems enable healthcare teams to securely access patient information from anywhere, streamlining communication and speeding up diagnosis and treatment decisions. Administrative tasks also become more efficient, allowing healthcare professionals to focus on patient care.
- Reliable data backup and secure storage: Cloud computing provides backup solutions that ensure patient data is securely stored and easily recoverable in case of system failure or disaster, ensuring minimal downtime and business continuity.
- Remote monitoring and telemedicine capabilities: Cloud platforms facilitate remote patient monitoring and telemedicine, allowing healthcare providers to offer care to patients in underserved or remote areas, thus improving access and patient outcomes.
- Faster innovation and technology integration: Cloud infrastructure enables healthcare organizations to quickly adopt new technologies like artificial intelligence (AI) and machine learning (ML), enhancing decision-making and enabling personalized care by efficiently analyzing large patient data sets.
Cloud-native innovations such as serverless computing and container orchestration (e.g., AWS Lambda and Amazon EKS) enable SMBs to improve compliance and scalability simultaneously, reducing operational complexity and risk.
- Better collaboration and decision-making: With cloud computing, real-time data sharing improves collaboration among healthcare teams across locations, ensuring decisions are based on the most current information and fostering more effective teamwork.
By using cloud computing, healthcare providers can improve their operational efficiency, reduce costs, and offer better, more accessible care to their patients.
HIPAA compliance requirements in cloud computing
Cloud computing is transforming healthcare by improving service quality, boosting operational efficiency, and enabling better patient outcomes. Below are the main HIPAA compliance factors to focus on:
1. Business associate agreements (BAAs) with cloud service providers (CSPs)
A Business Associate Agreement (BAA) is a legally binding contract between healthcare organizations and their cloud service providers (CSPs). The BAA outlines the provider’s responsibility to protect PHI (Protected Health Information) and comply with HIPAA regulations. Without a signed BAA, healthcare organizations cannot ensure that their CSP is following the necessary security and privacy protocols.
2. Ensuring data encryption at rest and in transit
To maintain HIPAA compliance, healthcare SMBs must ensure that Protected Health Information (PHI) is encrypted both at rest (when stored on cloud servers) and in transit (during transmission).
- Data at rest: PHI must be encrypted when stored on cloud servers to prevent unauthorized access in case of a breach.
- Data in transit: Encryption is also required when PHI is transmitted between devices and the cloud to protect against data interception during transit.
Encryption standards such as AES-256 are commonly used to meet HIPAA’s stringent data protection requirements.
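As an illustrative sketch (Python with boto3), the snippet below turns on default SSE-KMS encryption for a bucket holding PHI and attaches a policy that rejects any request made without TLS. The bucket name and KMS key alias are assumptions used only for illustration.

```python
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "phi-records-bucket"   # assumed bucket name

# Encrypt at rest: every new object is encrypted with a customer-managed KMS key.
s3.put_bucket_encryption(
    Bucket=BUCKET,
    ServerSideEncryptionConfiguration={
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "alias/phi-data-key",   # assumed key alias
            },
            "BucketKeyEnabled": True,
        }]
    },
)

# Encrypt in transit: deny any request that does not use HTTPS/TLS.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [f"arn:aws:s3:::{BUCKET}", f"arn:aws:s3:::{BUCKET}/*"],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }],
}
s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```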
3. Implementation of access controls and audit logging
To ensure HIPAA compliance, healthcare SMBs must implement access controls that limit PHI access to authorized personnel based on their roles (RBAC).
- Access controls: Only authorized personnel should have access to PHI. Role-based access control (RBAC) helps ensure that employees can only access the data necessary for their specific role.
- Audit logging: Cloud systems must include comprehensive audit logs that track all access to PHI, documenting who accessed data, when, and why. These logs are crucial for security audits and identifying unauthorized access.
4. Regular security risk assessments
Healthcare SMBs should perform regular security risk assessments to identify vulnerabilities in their cloud infrastructure.
- Evaluate cloud providers' security practices: Review the CSP’s security controls and conduct regular penetration testing to uncover vulnerabilities before they can be exploited.
- Ensure an efficient disaster recovery plan: Verify that the provider’s disaster recovery plan can restore data and services quickly enough to meet the organization’s continuity and compliance requirements.
By regularly assessing security, organizations can mitigate potential threats and maintain HIPAA compliance.
5. Data backup and disaster recovery
Cloud providers must offer reliable data backup and disaster recovery options to protect patient data from loss. Healthcare organizations should ensure that backup solutions meet HIPAA standards, such as geographically dispersed storage for redundancy and quick data recovery. In case of a system failure or breach, quick recovery is essential to minimize downtime and maintain service continuity.
6. Vendor management and third-party audits
Healthcare organizations must ensure that their cloud service providers and any third-party vendors follow HIPAA guidelines. Regular third-party audits should be conducted to verify that CSPs comply with HIPAA security and privacy standards. Organizations should work with their CSPs to address audit findings promptly and implement necessary improvements.
Addressing these areas helps mitigate risks associated with cloud adoption, enabling healthcare organizations to meet regulatory standards and continue delivering high-quality care.
Also Read: Building HIPAA-compliant applications on the AWS cloud.
To meet these compliance requirements, healthcare SMBs need to implement proactive strategies that protect patient data and align with HIPAA regulations.
Strategies for maintaining HIPAA compliance in the cloud

Healthcare organizations—especially SMBs—must adopt proactive and structured strategies to meet HIPAA requirements while leveraging the benefits of cloud computing. These strategies help protect sensitive patient data and maintain regulatory alignment across cloud environments.
- Conduct regular risk assessments: Identify vulnerabilities across all digital systems, including cloud platforms. Evaluate how electronic Protected Health Information (ePHI) is stored, accessed, and transmitted. Use risk assessment insights to strengthen internal policies and address compliance gaps.
- Develop clear cybersecurity and compliance policies: Outline roles, responsibilities, and response plans in the event of a breach. Policies should align with HIPAA rules and be regularly updated to reflect evolving cloud practices and threat landscapes.
- Implement efficient technical safeguards: Use firewalls, intrusion detection systems, and end-to-end encryption to secure data both at rest and in transit. Ensure automatic data backups and redundancy systems are in place for data recovery.
Adopting Infrastructure as Code (IaC) tools like Terraform or AWS CloudFormation allows SMBs to automate security policy enforcement and maintain consistent, auditable configurations aligned with HIPAA requirements (see the sketch after this list).
- Establish and maintain access control protocols: Adopt role-based access, strong password requirements, and multi-factor authentication. Limit ePHI access to only those who need it and track access through detailed audit logs.
- Ensure CSP signs and complies with a business associate agreement (BAA): This agreement legally binds the cloud provider to uphold HIPAA security standards. It’s a non-negotiable element to use any third-party service to handle ePHI.
- Continuously monitor compliance and security measures: Regularly review system activity logs and CSP practices to confirm adherence to HIPAA standards. Leverage cloud-native monitoring tools for real-time alerts and policy enforcement.
- Train staff regularly on HIPAA best practices: Human error remains a leading cause of data breaches. Conduct frequent training sessions to keep teams informed on compliance policies, security hygiene, and breach response procedures.
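As a hedged illustration of the Infrastructure as Code point above, one Python-friendly option is the AWS CDK, which synthesizes CloudFormation templates. The sketch below defines an encrypted, versioned, TLS-only S3 bucket for ePHI; the stack and construct names are purely illustrative, and a real deployment would add logging, backup, and IAM policies on top.

```python
from aws_cdk import App, Stack, RemovalPolicy
from aws_cdk import aws_s3 as s3
from constructs import Construct

class PhiStorageStack(Stack):
    """Illustrative stack: a locked-down S3 bucket for ePHI workloads."""

    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        s3.Bucket(
            self,
            "PhiBucket",                                   # logical ID (assumption)
            encryption=s3.BucketEncryption.KMS_MANAGED,    # SSE-KMS at rest
            enforce_ssl=True,                              # reject non-TLS requests
            versioned=True,                                # retain prior object versions
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
            removal_policy=RemovalPolicy.RETAIN,           # avoid accidental deletion
        )

app = App()
PhiStorageStack(app, "PhiStorageStack")
app.synth()
```

Because the template is generated from code, every configuration change is reviewable and repeatable, which supports the auditability HIPAA expects.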
By integrating these strategies, healthcare SMBs can confidently move forward in their cloud adoption journey while upholding the trust and safety of their patient data.
Choosing a HIPAA-compliant cloud service provider
Selecting the right cloud service provider (CSP) is critical for healthcare organizations looking to maintain HIPAA compliance. A compliant CSP should not only offer secure infrastructure but also demonstrate a clear understanding of HIPAA’s specific requirements for ePHI.
- Evaluate the CSP’s compliance certifications and track record: Look for providers that offer documented proof of compliance, such as HITRUST CSF, ISO/IEC 27001, or SOC 2 Type II. A strong compliance posture indicates the provider is prepared to handle sensitive healthcare data responsibly.
- Verify their willingness to sign a Business Associate Agreement (BAA): Under HIPAA, any third-party that handles ePHI is considered a business associate. A CSP must agree to sign a BAA, legally committing to uphold HIPAA security and privacy requirements. Without this agreement, working with the provider is non-compliant.
- Assess security features tailored for healthcare data: Choose CSPs that provide built-in encryption (at rest and in transit), detailed audit logging, role-based access controls, and real-time monitoring. These tools help healthcare SMBs meet HIPAA’s technical safeguard requirements.
- Review the provider’s shared responsibility model: Understand which aspects of security and compliance are managed by the CSP and which are the responsibility of the customer. A transparent shared responsibility model avoids compliance gaps and misconfigurations.
- Evaluate support and incident response capabilities: Choose a provider that offers 24/7 technical support, a clear escalation path for security incidents, and defined recovery time objectives. A timely response can minimize the impact of breaches or service disruptions.
- Consider the CSP’s experience in healthcare: A provider familiar with healthcare clients will be better equipped to meet HIPAA expectations. Look for case studies or client references that demonstrate success in the healthcare space.
By thoroughly vetting potential cloud providers through these criteria, healthcare organizations can make informed decisions that reduce risk and ensure compliance from the ground up.
Cloudtech helps your business achieve and maintain HIPAA compliance in the cloud, without compromising on performance or scalability. With Cloudtech, you get expert guidance, ongoing compliance support, and a secure infrastructure built to handle sensitive patient data.
Challenges and risks of cloud computing in healthcare
While cloud computing offers numerous benefits, it also presents specific challenges that healthcare organizations must address to stay compliant and secure.
- Management of shared infrastructure and potential compliance issues: Cloud environments often operate on a shared infrastructure model, where multiple clients access common resources. Without strict isolation and proper configuration, this shared model can increase the risk of unauthorized access or compliance violations.
- Handling security and privacy concerns effectively: Healthcare data is a prime target for cyberattacks. Ensuring encryption, access controls, and real-time monitoring is essential. However, gaps in internal policies or misconfigurations can lead to breaches, even with advanced cloud tools in place.
- Dealing with jurisdictional issues related to cloud data storage: When cloud providers store data across multiple geographic locations, regulatory conflicts may arise. Data residency laws vary by country and can impact how patient information is stored, accessed, and transferred. Healthcare organizations must ensure their provider aligns with regional legal requirements.
- Maintaining visibility and control over cloud resources: As services scale, it can become difficult for internal teams to maintain oversight of all assets, configurations, and user activity. Without proper governance, this lack of visibility can increase the risk of non-compliance and delayed incident response.
- Ensuring staff training and cloud literacy: Adopting cloud technology requires continuous training for IT and administrative staff. Misuse or misunderstanding of cloud tools can compromise security or lead to HIPAA violations, even with strong technical safeguards in place.
To overcome these challenges, healthcare organizations should follow best practices to ensure continuous HIPAA compliance and safeguard patient data.
Best practices for ensuring HIPAA compliance
Healthcare organizations using the cloud must follow proven practices to protect patient data and stay HIPAA compliant.
- Sign business associate agreements (BAAs): Ensure the cloud service provider signs a BAA, clearly defining responsibilities for handling ePHI and meeting HIPAA standards.
- Enforce access controls and monitor activity: Restrict access based on roles and monitor data activity through audit logs and alerts to catch and address unusual behavior early.
- Respond quickly to security incidents: Have a clear incident response plan to detect, contain, and report breaches promptly, following HIPAA’s Breach Notification Rule.
- Conduct regular risk assessments: Periodic reviews of the cloud setup help spot vulnerabilities and update safeguards to meet current HIPAA requirements.
- Train staff on HIPAA and cloud security: Educate employees on secure data handling and how to avoid common threats like phishing to reduce human error.
Conclusion
As healthcare organizations, particularly SMBs, move forward with digital transformation, ensuring HIPAA compliance in cloud computing is both a necessity and a strategic advantage. Protecting electronic protected health information (ePHI), reducing the risk of data breaches, and benefiting from scalable, cost-effective solutions are key advantages of HIPAA-compliant cloud services.
However, achieving compliance is not just about using the right technology; it requires a comprehensive strategy, the right partnerships, and continuous monitoring.
Looking for a reliable partner in HIPAA-compliant cloud solutions?
Cloudtech provides secure, scalable cloud infrastructure designed to meet HIPAA standards. With a focus on encryption and 24/7 support, Cloudtech helps organizations protect patient data while embracing the benefits of cloud technology.
FAQs
- What is HIPAA compliance in cloud computing?
HIPAA compliance in cloud computing ensures that cloud service providers (CSPs) and healthcare organizations adhere to strict regulations for protecting patient data, including electronic Protected Health Information (ePHI). This includes data encryption, secure storage, and ensuring privacy and security throughout the data lifecycle.
- How can healthcare organizations ensure their cloud service provider is HIPAA-compliant?
Healthcare organizations should ensure their cloud service provider signs a Business Associate Agreement (BAA), provides encryption methods (both at rest and in transit), and offers secure access controls, audit logging, and real-time monitoring to protect ePHI.
- What are the key benefits of using cloud computing for healthcare organizations?
Cloud computing provides healthcare organizations with scalability, improved accessibility, cost-effectiveness, enhanced data backup, and disaster recovery solutions. Additionally, it supports remote monitoring and telemedicine, facilitating more accessible patient care and improved operational efficiency.
- What are the consequences of non-compliance with HIPAA regulations in cloud computing?
Non-compliance with HIPAA regulations can lead to severe penalties, including hefty fines and damage to an organization’s reputation. It can also result in unauthorized access to sensitive patient data, leading to breaches of patient privacy and trust.
- What should be included in a HIPAA-compliant cloud security strategy?
A HIPAA-compliant cloud security strategy should include regular risk assessments, encryption of ePHI, access control mechanisms, audit logging, a disaster recovery plan, and ongoing staff training. Additionally, healthcare organizations should ensure their cloud provider meets all HIPAA technical safeguards and legal obligations.

Data is the backbone of all business decisions, especially when organizations operate with tight margins and limited resources. For SMBs, having data scattered across spreadsheets, apps, and cloud folders can hinder efficiency.
According to Gartner, poor data quality costs businesses an average of $12.9 million annually. SMBs cannot afford such inefficiency. This is where an Amazon Data Lake proves invaluable.
It offers a centralized and scalable storage solution, enabling businesses to store all their structured and unstructured data in one secure and searchable location. It also simplifies data analysis. In this guide, businesses will discover 10 practical best practices to help them build an AWS data lake that aligns with their specific goals.
What is an Amazon data lake, and why is it important for SMBs?
An Amazon Data Lake is a centralized storage system built on Amazon S3, designed to hold all types of data, whether it comes from CRM systems, accounting software, IoT devices, or customer support logs. Businesses do not need to convert or structure the data beforehand, which saves time and development resources. This makes data lakes particularly suitable for SMBs that gather data from multiple sources but lack large IT teams.
Traditional databases and data warehouses are more rigid. They require pre-defining data structures and often charge based on compute power, not just storage. A data lake, on the other hand, flips that model. It gives businesses more control, scales with growth, and facilitates advanced analytics, all without the high overhead typically associated with traditional systems.
To understand how an Amazon data lake works, it helps to know the five key components that support data processing at scale:

- Data ingestion: Businesses can bring in data from both cloud-based and on-premises systems using tools designed to efficiently move data into Amazon S3.
- Data storage: All data is stored in Amazon S3, a highly durable and scalable object storage service.
- Data cataloging: Services like AWS Glue automatically index and organize data, making it easier for businesses to search, filter, and prepare data for analysis.
- Data analysis and visualization: Data lakes can be connected to tools like Amazon Athena or QuickSight, enabling businesses to query, visualize, and uncover insights directly without needing to move data elsewhere.
- Data governance: Built-in controls such as access permissions, encryption, and logging help businesses manage data quality and security. Amazon S3 access logs can track user actions, and permissions can be enforced using AWS IAM roles or AWS Lake Formation.
Why an Amazon data lake matters for business
- Centralized access: Businesses can store all their data from product inventory to customer feedback in one place, accessible by teams across departments.
- Flexibility for all data types: Businesses can keep JSON files, CSV exports, videos, PDFs, and more without needing to transform them first.
- Lower costs at scale: With Amazon S3, businesses only pay for the storage they use. They can use Amazon S3 Intelligent-Tiering to reduce costs as data becomes less frequently accessed.
- Access to advanced analytics: Businesses can run SQL queries with Amazon Athena, train machine learning models with Amazon SageMaker, or build dashboards with Amazon QuickSight directly on their Amazon data lake, without moving the data.
With the rise of generative AI (GenAI), businesses can unlock even greater value from their Amazon data lake.
Amazon Bedrock enables SMBs to build and scale AI applications without managing underlying infrastructure. By integrating Bedrock with your data lake, you can use pre-trained foundation models to generate insights, automate data summarization, and drive smarter decision-making, all while maintaining control over your data security and compliance.
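As a rough sketch of that pattern in Python (boto3), the snippet below sends a small, already-queried extract from the data lake to a Bedrock foundation model and asks for a plain-language summary. The region, model ID, and prompt are assumptions, and the model must already be enabled for the account.

```python
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")  # assumed region

# A small extract already pulled from the data lake (e.g., via an Athena query).
weekly_kpis = "orders=1240, returns=86, avg_basket=54.20, top_region=Midwest"

response = bedrock.converse(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",     # assumed model ID
    messages=[{
        "role": "user",
        "content": [{"text": f"Summarize these weekly KPIs for a business owner: {weekly_kpis}"}],
    }],
    inferenceConfig={"maxTokens": 300, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
```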
10 best practices to build a smart, scalable data lake on AWS
A successful Amazon data lake is more than just a storage bucket. It’s a living, evolving system that supports growth, analysis, and security at every stage. These 10 best practices will help businesses avoid costly mistakes and build a data lake that delivers real, measurable results.
1. Design a tiered storage architecture
Start by separating data into three functional zones:
- Raw zone: This is the original data, untouched and unfiltered. Think IoT sensor feeds, app logs, or CRM exports.
- Staging zone: Store cleaned or transformed versions here. It’s used by data engineers for processing and QA.
- Curated zone: Only high-quality, production-ready datasets go here, which are used by business teams for reporting and analytics.
This setup ensures data flows cleanly through the pipeline, reduces errors, and keeps teams from working on outdated or duplicate files.
2. Use open, compressed data formats
Amazon S3 supports many file types, but not all formats perform the same. For analytical workloads, use columnar formats like Parquet or ORC.
- They compress better than CSV or JSON, saving you storage costs.
- Tools like Amazon Athena, Amazon Redshift Spectrum, and AWS Glue process them much faster.
- You only scan the columns you need, which speeds up queries and reduces compute charges.
Example: Converting JSON logs to Parquet can cut query costs by more than 70% when running regular reports.
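To make that conversion concrete, a minimal sketch in Python using pandas and PyArrow might look like the following; the file names and the snappy compression choice are assumptions.

```python
import pandas as pd

# Read newline-delimited JSON logs (one record per line).
df = pd.read_json("app-logs.json", lines=True)   # assumed input file

# Write a compressed, columnar Parquet file (requires pyarrow to be installed).
df.to_parquet("app-logs.parquet", engine="pyarrow", compression="snappy")

# Analytics engines such as Athena or Redshift Spectrum can now scan only the
# columns a query actually needs, instead of whole JSON documents.
```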
3. Apply fine-grained access controls
SMBs might have fewer users than large enterprises, but data access still needs control. Broad admin roles or shared credentials should be avoided.
- Roles and permissions should be defined with AWS IAM. Additionally, AWS Lake Formation provides advanced capabilities for data governance, allowing businesses to restrict access at the column or row level. For example, HR may have access to employee IDs but not salaries.
- When using AWS IAM roles together with AWS Lake Formation, tailor permissions carefully to restrict access, especially where column- or row-level access controls are implemented.
- Enable audit trails so you can track who accessed what and when.
- Use AWS CloudTrail for continuous monitoring of access and changes, and Amazon Macie to automatically discover and classify sensitive data, helping maintain security and compliance.
This protects sensitive data, helps you stay compliant (HIPAA, GDPR, etc.), and reduces internal risk.
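A hedged sketch of a column-level grant with boto3 and Lake Formation follows, reflecting the HR example above; the role ARN, database, table, and column names are assumptions.

```python
import boto3

lakeformation = boto3.client("lakeformation")

# Allow the HR analyst role to SELECT only non-sensitive employee columns.
lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier":
               "arn:aws:iam::123456789012:role/hr-analyst"},   # assumed role
    Resource={
        "TableWithColumns": {
            "DatabaseName": "people",        # assumed Glue database
            "Name": "employees",             # assumed table
            "ColumnNames": ["employee_id", "department", "start_date"],  # no salary column
        }
    },
    Permissions=["SELECT"],
)
```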
4. Tag data for lifecycle and access management
Tags are more than just labels; they are powerful tools for automation, organization, and cost tracking. By assigning metadata tags, businesses can:
- Automatically manage the lifecycle of data, ensuring that old data is archived or deleted at the right time.
- Apply granular access controls, ensuring that only the right teams or individuals have access to sensitive information.
- Track usage and generate reports based on team, project, or department.
- Feed cost allocation reports, enabling granular tracking of storage and processing costs by project or department.
For SMBs with lean IT teams, tagging streamlines data management and reduces the need for constant manual intervention, helping to keep the data lake organized and cost-efficient.
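For example, a small boto3 sketch that tags a bucket for cost allocation and tags an individual object for lifecycle handling; the bucket, key, and tag values are assumptions.

```python
import boto3

s3 = boto3.client("s3")

# Bucket-level tags feed cost allocation reports once activated in Billing.
s3.put_bucket_tagging(
    Bucket="sales-data-lake",                       # assumed bucket
    Tagging={"TagSet": [
        {"Key": "team", "Value": "analytics"},
        {"Key": "project", "Value": "churn-model"},
    ]},
)

# Object-level tags can drive lifecycle rules and fine-grained access policies.
s3.put_object_tagging(
    Bucket="sales-data-lake",
    Key="raw/2024/06/events.parquet",               # assumed key
    Tagging={"TagSet": [{"Key": "retention", "Value": "18-months"}]},
)
```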
5. Use Amazon S3 storage classes to control costs
Storage adds up, especially when you're keeping logs, backups, and historical data. Here's how to keep costs in check:
- Use Amazon S3 Standard for active data.
- Switch to Amazon S3 Intelligent-Tiering for unpredictable access.
- Amazon S3 Glacier is intended for rarely accessed archival data, and Amazon S3 Glacier Deep Archive is specifically designed for very long-term archival at a lower price point.
- Consider using Amazon S3 One Zone-IA (One Zone-Infrequent Access) for data that doesn't require multi-AZ resilience but needs to be accessed infrequently. This storage class offers potential cost savings.
- Set up Amazon S3 Lifecycle policies to automate transitioning data between Standard, Intelligent-Tiering, Glacier, and Deep Archive tiers, balancing cost and access needs efficiently.
Set up lifecycle policies that automatically move files based on age or access frequency. This approach helps businesses avoid unnecessary costs and ensures old data is properly managed without manual intervention.
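A minimal lifecycle-policy sketch with boto3 is shown below; the bucket, prefix, transition days, and expiration window are placeholder assumptions to adjust to real access patterns.

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="sales-data-lake",                       # assumed bucket
    LifecycleConfiguration={"Rules": [{
        "ID": "tier-raw-zone",
        "Filter": {"Prefix": "raw/"},               # applies to the raw zone only
        "Status": "Enabled",
        "Transitions": [
            {"Days": 30, "StorageClass": "INTELLIGENT_TIERING"},
            {"Days": 180, "StorageClass": "GLACIER"},
            {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
        ],
        "Expiration": {"Days": 2555},               # ~7 years, then delete
    }]},
)
```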
6. Catalog everything with AWS Glue
A data lake without a catalog is like a warehouse without a map. Businesses may store vast amounts of data, but without proper organization, finding specific information becomes a challenge. For SMBs, quick access to trusted data is essential.
Businesses should use the AWS Glue Data Catalog to:
- Register and track all datasets stored in Amazon S3.
- Maintain schema history for evolving data structures.
- Enable SQL-based querying with Amazon Athena or SageMaker.
- Simplify governance by organizing data into searchable tables and databases.
7. Automate ingestion and processing
Manual uploads and data preparation do not scale. If businesses spend time moving files, they aren't focusing on analyzing them. Automating this step keeps the data lake up to date and the team focused on deriving insights.
Here’s how businesses can streamline data ingestion and processing:
- Trigger workflows using Amazon S3 event notifications when new files arrive.
- Use AWS Lambda to validate, clean, or transform data in real time.
- For larger workloads, consider AWS Glue or Amazon Kinesis for batch or streaming processing, since Lambda's execution time limits are not ideal for large-scale data processing.
- Schedule recurring ETL jobs with AWS Glue for batch data processing.
- Reduce operational overhead and ensure data freshness without daily oversight.
- Utilize infrastructure as code tools like AWS CloudFormation or Terraform to automate data lake infrastructure provisioning, ensuring repeatability and easy updates.
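As an illustration of the Lambda-based validation step in the list above, here is a minimal handler sketch triggered by S3 event notifications; the raw/staging zone layout and the validation rule are assumptions.

```python
import json
import urllib.parse
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    """Triggered by S3 event notifications when a new object lands in the raw zone."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

        # Minimal validation: require parseable, newline-delimited JSON.
        rows = [json.loads(line) for line in body.splitlines() if line.strip()]

        # Hand the validated file to the staging zone for downstream ETL.
        s3.put_object(
            Bucket=bucket,
            Key=key.replace("raw/", "staging/", 1),   # assumed zone layout
            Body=body,
        )
        print(f"validated {len(rows)} records from s3://{bucket}/{key}")
```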
8. Partition the data strategically
As data grows, so do the costs and time required to scan it. Partitioning helps businesses limit what queries need to touch, which improves performance and reduces costs.
To partition effectively:
- Organize data by logical keys like year/month/day, customer ID, or region
- Ensure each partition folder follows a consistent naming convention
- Query tools like Amazon Athena or Amazon Redshift Spectrum will scan only what’s needed
- For example, querying one month of data instead of an entire year saves time and computing cost
- Use AWS Glue Data Catalog partitions to optimize query performance, and address the small files problem by periodically compacting data files to speed up Amazon Athena and Redshift Spectrum queries.
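For example, a hedged boto3 sketch that runs a partition-pruned Athena query so only one month of data is scanned; the database, table, partition columns, and results location are assumptions.

```python
import boto3

athena = boto3.client("athena")

query = """
    SELECT customer_id, SUM(amount) AS total_spend
    FROM sales_events
    WHERE year = '2024' AND month = '06'      -- partition columns prune the scan
    GROUP BY customer_id
"""

response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "analytics"},                       # assumed database
    ResultConfiguration={"OutputLocation": "s3://athena-results-bucket/"}, # assumed bucket
)
print("query execution id:", response["QueryExecutionId"])
```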
9. Encrypt data at rest and in transit
Whether businesses are storing customer records or financial reports, security is non-negotiable. Encryption serves as the first line of defense, both in storage and during transit.
Protect your Amazon data lake with:
- S3 server-side encryption to secure data at rest
- HTTPS enforcement to prevent data from being exposed during transfer
- AWS Key Management Service (KMS) for managing, rotating, and auditing encryption keys
- Compliance with standards like HIPAA, SOC2, and PCI without adding heavy complexity
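For the key-management point above, a short boto3 sketch: create a customer-managed KMS key, enable automatic rotation, and give it an alias that bucket encryption settings can reference. The alias name is an assumption.

```python
import boto3

kms = boto3.client("kms")

# Customer-managed key for data-lake objects; its usage is auditable via CloudTrail.
key = kms.create_key(Description="Data lake encryption key")
key_id = key["KeyMetadata"]["KeyId"]

# Automatic annual rotation of the key material.
kms.enable_key_rotation(KeyId=key_id)

# Friendly alias so bucket encryption settings can reference the key by name.
kms.create_alias(AliasName="alias/data-lake-key", TargetKeyId=key_id)   # assumed alias
```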
10. Monitor and audit the data lake
Businesses cannot fix what they cannot see. Monitoring and logging provide insights into data access, usage patterns, and potential issues before they impact teams or customers.
To keep their Amazon data lake accountable, businesses can use:
- AWS CloudTrail to log all API calls, access attempts, and bucket-level activity.
- Amazon CloudWatch to monitor usage patterns and performance issues and trigger alerts.
- AWS Config to track AWS resource configurations, which is useful for auditing purposes.
- Dashboards and logs that help businesses prove compliance and optimize operations.
- Visibility that supports continuous improvement and risk management.
Common mistakes SMBs make (and how to avoid them)

Building a smart, scalable Amazon data lake requires more than just uploading data. SMBs often make critical mistakes that impact performance, costs, and security. Here’s what to avoid:
1. Dumping all data with no structure
Throwing data into a lake without organization is one of the quickest ways to create chaos. Without structure, your data becomes hard to navigate and prone to errors. This leads to wasted time, incorrect insights, and potential security risks.
How to avoid it:
- Implement a tiered architecture (raw, staging, curated) to keep data clean and organized.
- Use metadata tagging for easy tracking, access, and management.
- Set up partitioning strategies so you can quickly query the relevant data.
2. Ignoring cost control features
Without proper oversight, a data lake’s costs can spiral out of control. Amazon S3 storage, data transfer, and analytics services can add up quickly if businesses don’t set boundaries.
How to avoid it:
- Use Amazon S3 Intelligent-Tiering for unpredictable access patterns, and Amazon Glacier for infrequent access or archival data.
- Set up lifecycle policies to automatically archive or delete old data.
- Regularly audit storage and analytics usage to ensure costs are kept under control.
3. Lacking role-based access
Without role-based access control (RBAC), a data lake can become a security risk. Granting blanket access to all users increases the likelihood of accidental data exposure or malicious activity.
How to avoid it:
- Use AWS IAM roles to define who can access what data.
- Implement AWS Lake Formation to manage permissions at the table, column, or row level.
- Regularly audit who has access to sensitive data and ensure permissions are up to date.
4. Overcomplicating the tech stack
It’s tempting to integrate every cool tool and service, but complexity doesn’t equal value; it often leads to confusion and poor performance. For SMBs, simplicity and efficiency are key.
How to avoid it:
- Start with basic services (like Amazon S3, AWS Glue, and Athena) before adding layers.
- Keep integrations minimal, and make sure each service adds clear value to your data pipeline.
- Prioritize usability and scalability over over-engineering.
- Additionally, Amazon Redshift Spectrum can be a valuable addition for SMBs that need SQL-based querying over Amazon S3 data, especially for larger datasets.
These common mistakes are easy for businesses to fall into, but once they are understood, they are simple to avoid. By staying focused on simplicity, cost control, and security, businesses can ensure that their Amazon data lake serves their needs effectively.
Checklist for businesses ensuring the health of an Amazon data lake
Use this checklist to quickly evaluate the health of an Amazon data lake. Regularly checking these points ensures the data lake is efficient, secure, and cost-effective.
Zones created?
- Has data been organized into raw, staging, and curated zones?
- Are data types and access needs clearly defined for each zone?
Access policies in place?
- Are AWS IAM roles properly defined for users with specific access needs?
- Has AWS Lake Formation been set up for fine-grained permissions?
Data formats optimized?
- Is columnar format like Parquet or ORC being used for performance and cost efficiency?
- Have large files been compressed to reduce storage costs?
Costs tracked?
- Are Amazon S3 Intelligent-Tiering and Amazon Glacier being used to minimize storage expenses?
- Is there a regular review of Amazon S3 storage usage and lifecycle policies?
Query performance healthy?
- Has partitioning been implemented for faster and cheaper queries?
- Are queries running efficiently with services like Amazon Athena or Amazon Redshift Spectrum?
By using this checklist regularly, businesses will be able to keep their Amazon data lake running smoothly and cost-effectively, while ensuring security and performance remain top priorities.
Conclusion
Implementing best practices for an Amazon data lake offers clear benefits. By structuring data into organized zones, automating processes, and using cost-efficient storage, businesses gain control over their data pipeline. Encryption and fine-grained access policies ensure security and compliance, while optimized queries and cost management turn the data lake into an asset that drives growth, rather than a burden.
Cloud modernization is within reach for SMBs, and it doesn’t have to be a complex, resource-draining project. With the right guidance and tools, businesses can build a scalable and secure data lake that grows alongside their needs. Cloudtech specializes in helping SMBs modernize AWS environments through secure, scalable, and optimized data lake strategies—without requiring full platform migrations.
SMBs interested in improving their AWS data infrastructure can consult Cloudtech for tailored guidance on modernization, security, and cost optimization.
FAQs
1. How do businesses migrate existing on-premises data to their Amazon data lake?
Migrating data to an Amazon data lake can be done using tools like AWS DataSync for efficient transfer from on-premises to Amazon S3, or AWS Storage Gateway for hybrid cloud storage. For large-scale data, AWS Snowball offers a physical device for transferring large datasets when bandwidth is limited.
2. What are the best practices for data ingestion into an Amazon data lake?
To ensure seamless data ingestion, businesses can use Amazon Kinesis Data Firehose for real-time streaming, AWS Glue for ETL processing, and AWS Database Migration Service (DMS) to migrate existing databases into their data lake. These tools automate and streamline the process, ensuring that data remains up-to-date and ready for analysis.
3. How can businesses ensure data security and compliance in their data lake?
For robust security and compliance, businesses should use AWS IAM to define user permissions, AWS Lake Formation to enforce data access policies, and ensure data is encrypted with Amazon S3 server-side encryption and AWS KMS. Additionally, enabling AWS CloudTrail will allow businesses to monitor access and track changes for audit purposes, ensuring full compliance.
4. What are the cost implications of building and maintaining a data lake?
While Amazon S3 is cost-effective, managing costs requires businesses to utilize Amazon S3 Intelligent-Tiering for unpredictable access patterns and Amazon Glacier for infrequent data. Automating data transitions with lifecycle policies and managing data transfer costs, especially across regions, will help keep expenses under control.
5. How do businesses integrate machine learning and analytics with their data lake?
Integrating Amazon Athena for SQL queries, Amazon SageMaker for machine learning, and Amazon QuickSight for visual analytics will help businesses unlock the full value of their data. These AWS services enable seamless querying, model training, and data visualization directly from their Amazon data lake.
Get started on your cloud modernization journey today!
Let Cloudtech build a modern AWS infrastructure that’s right for your business.