
I'm trying to dump a Jena database as triples.

There seems to be a command that sounds perfectly suited to the task: tdb2.tdbdump

jena@debian-clean:~$ ./apache-jena-3.8.0/bin/tdb2.tdbdump --help
tdbdump : Write a dataset to stdout (defaults to N-Quads)
  Output control
      --output=FMT           Output in the given format, streaming if possible.
      --formatted=FMT        Output, using pretty printing (consumes memory)
      --stream=FMT           Output, using a streaming format
      --compress             Compress the output with gzip
  Location
      --loc=DIR              Location (a directory)
      --tdb=                 Assembler description file
  Symbol definition
      --set                  Set a configuration symbol to a value
      --mem=FILE             Execute on an in-memory TDB database (for testing)
      --desc=                Assembler description file
  General
      -v   --verbose         Verbose
      -q   --quiet           Run with minimal output
      --debug                Output information for debugging
      --help
      --version              Version information
      --strict               Operate in strict SPARQL mode (no extensions of any kind)
jena@debian-clean:~$

But I've not succeeded in getting it to write anything to stdout.

When I use the --loc parameter to point to a DB, a new copy of that DB appears in the subfolder: Data-0001, but nothing appears in STDOUT.

When I try the --tdb parameter, and point it to a ttl file, I get a stack trace complaining about its formatting.

Google has turned up the Jena documentation telling me the command exists, and that's it. So any help appreciated.

Ben Hillier
  • `--loc` should be the same as used to create the database. Suppose that's "DB2". For TDB2 (not TDB1), once the database has been created, `DB2/Data-0001` will already exist. Do not use this for `--loc`. If it is a TDB1 database (the files are in the directory at "--loc"), then use `tdbdump`. An empty database has no triples/quads in it so you would get no output. – AndyS Sep 08 '20 at 17:56
  • @AndyS I could query my database in Fuseki, so it absolutely contained triples before I ever tried to dump it. I was quite convinced this was a TDB2 database, but seeing your comments, I'm starting to doubt myself. I'll see if that's the issue. Thanks! – Ben Hillier Sep 10 '20 at 06:35
  • FYI Fuseki next release detects existing database type. Currently, it has to be called with the same setup each time it is run, which is fragile regarding TDB1/TDB2. If you created the TDB2 database outside Fuseki and only use command line args, you'll need "--tdb2". – AndyS Sep 10 '20 at 10:17
  • OK! Thanks @AndyS. Put any of this in an answer, and I'll give you the 25 points :-) – Ben Hillier Sep 10 '20 at 11:28

1 Answer


"--loc" should be the same as used to create the database.

Suppose that's "DB2". For TDB2 (not TDB1), once the database has been created, "DB2/Data-0001" will already exist. Do not use this for --loc. Use "--loc DB2".

If it is a TDB1 database (the files are in the directory at "--loc", with no "Data-0001"), then use tdbdump. An empty database has no triples/quads in it, so you would get no output.
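
The layout rule above can be sketched as a small shell helper. This is only an illustration: `pick_dump_tool` is a hypothetical function, not part of Jena, and it assumes that the presence of a "Data-0001" subdirectory is enough to tell TDB2 from TDB1.

```shell
# Sketch only: pick_dump_tool is a hypothetical helper, not a Jena command.
# It prints the dump command that matches the directory layout of a database.
pick_dump_tool() {
  loc="${1%/}"

  # Guard against pointing at the storage subdirectory instead of its parent.
  case "$(basename "$loc")" in
    Data-[0-9]*)
      echo "use the parent directory of $loc for --loc" >&2
      return 1
      ;;
  esac

  if [ -d "$loc/Data-0001" ]; then
    # TDB2 layout: storage lives in a Data-NNNN subdirectory of --loc.
    echo "tdb2.tdbdump"
  else
    # Assumed TDB1 layout: database files sit directly in the --loc directory.
    echo "tdbdump"
  fi
}

# Usage (assumes the Jena bin directory is on PATH and the database is ./DB2):
#   "$(pick_dump_tool DB2)" --loc DB2 > dump.nq
```

Redirecting stdout to a file keeps the N-Quads output separate from any log messages the tool writes.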

Fuseki currently (up to 3.16.0) has to be called with the same setup each time it is run, which is fragile regarding TDB1/TDB2. If you created the TDB2 database outside Fuseki and only use command line args, you'll need "--tdb2" each time.

The next Fuseki release (3.17.0) detects the existing database type.

AndyS