Questions tagged [data-pipeline]
95 questions
-1
votes
1 answer
Data Pipeline using SQL and Python
I need to create a data pipeline using Python. I want to connect to MySQL from Python, read the tables into dataframes, perform pre-processing, and then load the data back into the MySQL database. I was able to connect to the MySQL database using mysql connector and…
Disha09
-1
votes
1 answer
Google Dataflow pipeline for varying schema
I have a product for defining and configuring business workflows. Part of this product is a form-builder that lets users set up different forms.
All of the form data is stored in MongoDB in the following structure:
- form_schemas
{
"_id" :…
Sharath Chandra
-1
votes
1 answer
Build an end-to-end data analysis platform
I need to create an end-to-end platform:
Input data collection and storage - Data will be periodically collected via FTP and stored in the cloud.
Data Analysis - The data will be analyzed (using Tableau or other analytics software)
Reports - Daily…
priya
-3
votes
2 answers
Combining data from different sources in Apache Spark
I am exploring Apache Spark for a project where I want to get data from different sources: database tables (Postgres and BigQuery) and text. The data will be processed and fed into another table for analytics. My choice of programming language…
user4122421
-4
votes
1 answer
Add images from disk to a TensorFlow dataset
I am using TensorFlow Datasets' tfds.load function to load my data:
import tensorflow_datasets as tfds
import tensorflow as tf
(raw_train, raw_validation, raw_test), metadata = tfds.load(
'cats_vs_dogs',
split=['train[:80%]',…
Stat Tistician