Questions tagged [kedro]

Kedro is an open-source Python library that helps you build production-ready data and analytics pipelines.

90 questions
0
votes
0 answers

kedro-airflow creates DAGs that throw errors

I am using kedro-airflow to create a DAG for Airflow, but the DAG it creates throws an error (see below). The flow is just a very simple test flow, and it runs without errors with kedro run. Airflow also runs other DAGs without any problem. …
nmduarte
  • 33
  • 1
  • 4
0
votes
2 answers

Data versioning of "Hello_World" tutorial

I have added "versioned: true" in the "catalog.yml" file of the "hello_world" tutorial: example_iris_data: type: pandas.CSVDataSet filepath: data/01_raw/iris.csv versioned: true Then when I used "kedro run" to run the tutorial, it has error…
Y. huang
  • 43
  • 5
0
votes
0 answers

Kedro ability to execute nodes on multiple virtual machines

Is there any way to run a Kedro pipeline's nodes on multiple machines to speed up processing, other than using a Spark cluster? Or better, do you plan to support it in the near future? Thank you!
Andrej
  • 549
  • 6
  • 13
0
votes
0 answers

How can I import local package dependencies into Kedro notebooks?

I've placed package dependencies (wheels) of a Kedro project into a /deps/*.whl directory. I'm using a venv installed into /.venv and manage it using Poetry. Packages are referenced in pyproject.toml like this (here e.g.…
thinwybk
  • 2,493
  • 16
  • 40
0
votes
0 answers

Running jupyter lab in kedro project in vscode under windows not possible

On our Windows 10 machines I tried to set up Kedro projects that were created on Ubuntu and configured for VSCode. On Ubuntu they work just fine. However, if I run kedro jupyter lab in the VSCode integrated terminal on Windows I get the following error: [C…
thinwybk
  • 2,493
  • 16
  • 40
0
votes
1 answer

Accessing Kedro CLI from an existing project

I have an existing project, cloned with git clone. After I pip install kedro I can run kedro info fine, but I don't seem to have access to the project's CLI. For example, if I try to run kedro install I get the following error: Usage: kedro [OPTIONS]…
roman
  • 1,150
  • 8
  • 28
0
votes
1 answer

kedro error: Pipeline does not contain nodes named ['preprocess_companies']

I was following the Kedro pipelines tutorial, created all the needed files, and started Kedro with kedro run --node=preprocess_companies. It returns the following error: ValueError: Pipeline does not contain nodes named ['preprocess_companies']. did try…
0
votes
1 answer

How to snap a python package with plugin packages?

I'd like to bundle the Python package kedro which provides a command line interface (kedro). In addition I'd like to put the Python package kedro-docker into the snap as well. This second package extends the first package's command line interface…
thinwybk
  • 2,493
  • 16
  • 40
0
votes
2 answers

'kedro' is not recognized as an internal or external command, operable program or batch file

I am trying to install Kedro but I am getting this error. I know this error usually arises because kedro is not in my PATH. I tried adding the file path to my PATH and I am still getting the same error. When I run: pip show kedro output: Name:…
0
votes
1 answer

Access Kedro context from decorator

I am trying to create a decorator in which I need some information about the project and/or catalog. Is it possible to access the project context from inside of the decorator? I am looking for things like project_name, catalog entry name, and…
Waylon Walker
  • 443
  • 3
  • 10
0
votes
1 answer

How do I select which columns to load in a Kedro CSVLocalDataSet?

I have a csv file that looks like a,b,c,d 1,2,3,4 5,6,7,8 and I want to load it in as a Kedro CSVLocalDataSet, but I don't want to read the entire file. I only want a few columns (say a and b for example). Is there any way for me to specify the…
0
votes
1 answer

How to run functions from a Class in the nodes.py file?

I want to organize the node functions by Classes in the nodes.py file. For example, functions related to cleaning data are in the "CleanData" Class, with a @staticmethod decorator, while other functions will stay in the "Other" Class, without any…
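The @staticmethod part of this question can be illustrated in plain Python. This is a minimal sketch, not a confirmed Kedro recipe: the method name drop_missing and the sample rows are hypothetical, and the assumption is that a function reached as CleanData.drop_missing is an ordinary callable that could be referenced wherever a module-level function would be.

```python
class CleanData:
    """Used purely as a namespace for cleaning functions (hypothetical example)."""

    @staticmethod
    def drop_missing(rows):
        # Keep only rows in which no value is None.
        return [row for row in rows if None not in row.values()]


# Accessing a static method through the class yields a plain function,
# so it can be passed around like any module-level function.
clean_func = CleanData.drop_missing

rows = [{"a": 1, "b": 2}, {"a": None, "b": 3}]
cleaned = clean_func(rows)
```

Because no instance is involved, nothing needs to be constructed before the function is handed to other code.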
0
votes
1 answer

How can we get the pipeline to read columns with special characters?

I am using the "usecols" parameter to read some columns of an .xlsx file (I am using the xls_local.py file from the Kedro tutorial), but the program says that "usecols do not match columns, columns expected but not found:" and it only shows the columns…
0
votes
3 answers

How to run a pipeline except for a few nodes?

I want to run a pipeline for different files, but some of them don't need all of the defined nodes. How can I pass them?
-3
votes
1 answer

How to pass a literal value to a node?

I've got a function def do_something(input_data, column: int): # Do something with one column of the data Now I need to create a node, but I can't do node(do_something, ["input_data", 1], "output"). How can I put the constant value into the…
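One general Python way to bind a constant to an argument is functools.partial. The sketch below is only an illustration under assumptions: the body of do_something and the sample data are made up, and whether a given Kedro version accepts a partial object as a node function is not confirmed here. Kedro's parameters mechanism may be the more conventional route.

```python
from functools import partial


def do_something(input_data, column: int):
    # Placeholder body (assumption): pick one column from each row.
    return [row[column] for row in input_data]


# Bind the literal value up front; the resulting callable takes only input_data.
do_something_col_1 = partial(do_something, column=1)

result = do_something_col_1([[10, 20], [30, 40]])
```

The bound callable now has a single remaining input, so only the dataset name would need to be listed as an input.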