Kedro is an open-source Python library that helps you build production-ready data and analytics pipelines.
Questions tagged [kedro]
90 questions
0
votes
0 answers
kedro-airflow creates DAGs that throw errors
I am using kedro-airflow to create a DAG for Airflow, but the generated DAG throws an error (see below). The flow is just a very simple test flow, and it runs without errors with kedro run. Airflow also runs other DAGs without any problem.
…
nmduarte
0
votes
2 answers
Data versioning of "Hello_World" tutorial
I have added "versioned: true" to the "catalog.yml" file of the "hello_world" tutorial.
example_iris_data:
  type: pandas.CSVDataSet
  filepath: data/01_raw/iris.csv
  versioned: true
Then, when I used "kedro run" to run the tutorial, it raised an error…
Y. huang
0
votes
0 answers
Kedro ability to execute nodes on multiple virtual machines
Is there any way to run a Kedro pipeline's nodes on multiple machines to speed up processing, other than on a Spark cluster?
Or better, do you plan to support it in the near future?
Thank you!
Andrej
0
votes
0 answers
How can I import local package dependencies into Kedro notebooks?
I've placed package dependencies (wheels) of a Kedro project into a /deps/*.whl directory. I'm using a venv installed into /.venv and manage it using Poetry.
Packages are referenced in pyproject.toml like this (here e.g.…
thinwybk
0
votes
0 answers
Running jupyter lab in kedro project in vscode under windows not possible
On our Windows 10 machines I tried to set up Kedro projects that were originally set up on Ubuntu and configured for VSCode. On Ubuntu they work just fine. However, if I run kedro jupyter lab in the VSCode integrated terminal on Windows, I get the following error:
[C…
thinwybk
0
votes
1 answer
Accessing Kedro CLI from an existing project
I have an existing project, cloned with git clone.
After I pip install kedro I can run kedro info fine, but I don't seem to have access to the project's CLI. For example, if I try to run kedro install I get the following error:
Usage: kedro [OPTIONS]…
roman
0
votes
1 answer
kedro error: Pipeline does not contain nodes named ['preprocess_companies']
I was following the Kedro pipelines tutorial, created all the needed files, and started Kedro with
kedro run --node=preprocess_companies
It returns the following error: ValueError: Pipeline does not contain nodes named ['preprocess_companies']. did try…
0
votes
1 answer
How to snap a python package with plugin packages?
I'd like to bundle the Python package kedro which provides a command line interface (kedro). In addition I'd like to put the Python package kedro-docker into the snap as well. This second package extends the first package's command line interface…
thinwybk
0
votes
2 answers
'kedro' is not recognized as an internal or external command, operable program or batch file
I am trying to install Kedro but I am getting this error. I know that most of the time this error arises because kedro is not in my PATH. I tried adding the file path to my PATH and I am still getting the same error.
When I run:
pip show kedro
output:
Name:…
0
votes
1 answer
Access Kedro context from decorator
I am trying to create a decorator in which I need some information about the project and/or catalog. Is it possible to access the project context from inside of the decorator? I am looking for things like project_name, catalog entry name, and…
Waylon Walker
0
votes
1 answer
How do I select which columns to load in a Kedro CSVLocalDataSet?
I have a csv file that looks like
a,b,c,d
1,2,3,4
5,6,7,8
and I want to load it in as a Kedro CSVLocalDataSet, but I don't want to read the entire file. I only want a few columns (say a and b for example).
Is there any way for me to specify the…
Anton Kirilenko
0
votes
1 answer
How to run functions from a Class in the nodes.py file?
I want to organize the node functions by Classes in the nodes.py file. For example, functions related to cleaning data are in the "CleanData" Class, with a @staticmethod decorator, while other functions will stay in the "Other" Class, without any…
sofiacosta29
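A minimal sketch of the setup the question above describes, in plain Python with no Kedro import. The method names `drop_empty` and `scale` are hypothetical, and treating the "Other" methods as ordinary instance methods is an assumption, since the excerpt is cut off. It only shows that a static method can be referenced through the class as a plain function, while an instance method needs an object before it can be called.

```python
class CleanData:
    """Groups cleaning functions; static methods need no instance."""

    @staticmethod
    def drop_empty(rows):
        # Keep only rows that contain at least one non-empty cell.
        return [row for row in rows if any(cell != "" for cell in row)]


class Other:
    """Groups other functions as regular instance methods (assumption)."""

    def scale(self, rows, factor):
        # Multiply every cell by a constant factor.
        return [[cell * factor for cell in row] for row in rows]


# A static method is reachable through the class and behaves like a plain function.
clean = CleanData.drop_empty
# An instance method has to be bound to an object first.
other = Other().scale

cleaned = clean([[1, 2], ["", ""], [3, 4]])
scaled = other(cleaned, 10)
print(cleaned)
print(scaled)
```

Both `clean` and `other` are ordinary callables at this point, so in principle either could be handed to whatever expects a node function.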
0
votes
1 answer
How can we get the pipeline to read columns with special characters?
I am using the "usecols" parameter to get some columns of a .xlsx file (I am using the xls_local.py file from the Kedro tutorial) but the program says that "usecols do not match columns, columns expected but not found:" and it only shows the columns…
sofiacosta29
0
votes
3 answers
How to run a pipeline except for a few nodes?
I want to run a pipeline for different files, but some of them don't need all of the defined nodes. How can I skip those nodes?
sofiacosta29
-3
votes
1 answer
How to pass a literal value to a node?
I've got a function
def do_something(input_data, column: int):
    # Do something with one column of the data
Now I need to create a node, but I can't do node(do_something, ["input_data", 1], "output"). How can I put the constant value into the…
Anton Kirilenko
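One possible way to approach the question above, sketched in plain Python: bind the constant with `functools.partial` so the resulting callable takes only the dataset input. The body of `do_something` here is a hypothetical stand-in, since the original only has a placeholder comment, and the name `do_something_col1` is made up for illustration. This is not necessarily the approach the answers recommend.

```python
from functools import partial, update_wrapper


def do_something(input_data, column: int):
    """Hypothetical body: return the values of one column from a list of rows."""
    return [row[column] for row in input_data]


# Bind the literal value; the new callable only needs input_data.
do_something_col1 = partial(do_something, column=1)

# partial objects have no __name__ of their own; copy the metadata over
# so that tools which inspect the function name keep working.
update_wrapper(do_something_col1, do_something)

rows = [[1, 2, 3, 4], [5, 6, 7, 8]]
result = do_something_col1(rows)
print(result)
```

The bound callable could then stand in for `do_something` wherever a single-input function is expected, with "input_data" as its only input.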