
To learn about fetch, promises, and other JavaScript features, I'm trying to write a small script that suggests words to learn (based on their difficulty) from a given Japanese text.

It uses a Japanese parser called Kuromojin.

A parser like Kuromojin tokenizes a phrase into words.

e.g. data: "日本語が上手ですね!" → tokenized words: [{surface_form: 日本語}, {surface_form: が}, {surface_form: 上手}, {surface_form: です}, {surface_form: ね}, {surface_form: !}]

The script first tokenizes the text, then fetches each token's JLPT level from the API of a Japanese dictionary (jisho.org).

The app.js file:

import { tokenize, getTokenizer } from "kuromojin";
import { parseIntoJisho } from "./parseintojisho.js";

const text = "日本語が上手ですね!";
let tokenSet = new Set();

getTokenizer().then(tokenizer => {});

tokenize(text).then(results => { // the result is an array of objects

  results.forEach((token) => {
    tokenSet.add(token.surface_form);
  })

  console.log("======Begin to find JLPT level for below items======");
  tokenSet.forEach((item) => {   
    console.log(item);

    parseIntoJisho(item); // will throw a console log of whichever item's corresponding data is resolved, regardless of item order here, right?
  });
})

The parseintojisho.js file:

import fetch from 'node-fetch';

/* @param data is string */


export function parseIntoJisho(data) {
  fetch(encodeURI(`https://jisho.org/api/v1/search/words?keyword=${data}`))
    .then(res => res.json())
    .then(jsondata => {
      let JLPTvalueArray = jsondata.data[0].jlpt;

      if (JLPTvalueArray.length) {
        let jlptLevel = JLPTvalueArray.flatMap(str => str.match(/\d+/));
        const max = Math.max(...jlptLevel);
        if (max >= 3) {
          console.log(data + " is of JLPT level N3 or above.")
        } else console.log(data + " is of JLPT level N1 or N2.");
      } else console.log(data + " has no JLPT value.")
    })
    .catch(function(err){
      console.log("No data for " + data);
    })
}

The script works, but instead of showing the JLPT level for each tokenized word in order, it shows them in a seemingly random order. I guess whichever request resolves first is logged first?
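A minimal sketch of that behaviour, with simulated delays standing in for the network requests. The names `fakeLookup` and `completionOrder` are made up for illustration and are not part of kuromojin or the jisho.org API:

```javascript
// Simulated lookup: resolves with `word` after `ms` milliseconds.
// This is only a stand-in for fetch; no network request is made.
function fakeLookup(word, ms) {
  return new Promise(resolve => setTimeout(() => resolve(word), ms));
}

// Start every lookup immediately and record each word when its promise
// settles, which is what logging inside each .then() effectively does.
function completionOrder(entries) {
  const finished = [];
  const pending = entries.map(([word, ms]) =>
    fakeLookup(word, ms).then(w => { finished.push(w); })
  );
  return Promise.all(pending).then(() => finished);
}

// The words are started in text order, but the recorded order follows
// how long each simulated request takes.
completionOrder([["日本語", 60], ["が", 10], ["上手", 30]])
  .then(order => console.log(order));
```

All requests start in the order of the tokens, but each callback runs only when its own response arrives, so the log order reflects response times rather than token order.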

From what I've found, Promise.all() may solve my problem, but I couldn't figure out how to implement it correctly.

Is there a way to print the fetched JLPT levels in the order of the tokenized items that were passed into parseIntoJisho(item)?

$ node app.js
======Begin to find JLPT level for below items======
日本語
が
上手
です
ね
!

です has no JLPT value. // should be "日本語 has no JLPT value." here instead
No data for ! // should be "が has no JLPT value." here instead
上手 is of JLPT level N3 or above. 
日本語 has no JLPT value. // should be "です has no JLPT value." here instead
ね is of JLPT level N3 or above.
が has no JLPT value. // should be "No data for !" here instead
aanhlle

1 Answer


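Use .map instead of forEach to collect one promise per token, then log each item of the resulting array once all the asynchronous requests are done. Promise.all resolves to an array in the same order as its input, regardless of which request finishes first.

To pass the original string along with the asynchronous result, have parseIntoJisho return a Promise.all over both the original data and the fetch promise, so that each result is a [data, jlpt] pair.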

// Resolves to a [data, jlpt] pair; jlpt is null if the request or the lookup fails.
// Declared before it is used, because a const cannot be referenced before
// the line that initializes it has run.
const parseIntoJisho = data => Promise.all([
  data,
  fetch(encodeURI(`https://jisho.org/api/v1/search/words?keyword=${data}`))
    .then(res => res.json())
    .then(jsondata => jsondata.data[0].jlpt)
    .catch(err => null)
]);

// A Set has no .map, so spread it into an array first.
// The results arrive in the same order as the tokens.
Promise.all([...tokenSet].map(parseIntoJisho))
  .then((results) => {
    for (const [data, result] of results) {
      if (!result) continue; // there was an error
      if (result.length) {
        const jlptLevel = result.flatMap(str => str.match(/\d+/));
        const max = Math.max(...jlptLevel);
        if (max >= 3) {
          console.log(data + " is of JLPT level N3 or above.")
        } else console.log(data + " is of JLPT level N1 or N2.");
      } else console.log(data + " has no JLPT value.")
    }
  });
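The same idea could also be sketched with async/await, pairing each word with its result by index instead of with a nested Promise.all. This is only an illustration under assumptions: `describeLevel`, `describeAll`, and the injected `lookup` parameter are hypothetical names, and the messages simply mirror the ones in the question.

```javascript
// Build the message for one word. `jlpt` is an array such as ["jlpt-n5"],
// an empty array, or null/undefined when the lookup failed.
function describeLevel(word, jlpt) {
  if (jlpt == null) return "No data for " + word;
  // Pull the digits out of each tag; tags without digits are skipped.
  const levels = jlpt.flatMap(tag => tag.match(/\d+/) ?? []).map(Number);
  if (levels.length === 0) return word + " has no JLPT value.";
  return Math.max(...levels) >= 3
    ? word + " is of JLPT level N3 or above."
    : word + " is of JLPT level N1 or N2.";
}

// `lookup` is a hypothetical function: word -> promise of a jlpt array.
async function describeAll(words, lookup) {
  const results = await Promise.all(
    words.map(word =>
      Promise.resolve()
        .then(() => lookup(word))
        .catch(() => null) // a failed lookup becomes null instead of rejecting
    )
  );
  // results[i] belongs to words[i], so the two arrays can be zipped by index.
  return words.map((word, i) => describeLevel(word, results[i]));
}
```

With the API from the question, `lookup` could be something like `word => fetch(url).then(res => res.json()).then(json => json.data[0].jlpt)`, where `url` is the search URL built from `word`; the returned messages can then be logged in order with a plain loop.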
CertainPerformance
  • Hi, thanks for your answer! I use `const tokenSetArray = [...tokenSet];` in order to use the `.map` for tokenSet. However, I've tried some ways but couldn't work your answer out. Particular with `const parseIntoJisho = data =>` I get the error data is undefined. Could you elaborate a bit more in your answer (particularly how to embed them into my code) ? Thank you! – aanhlle Mar 18 '21 at 15:54
  • If after `parseIntoJisho = data =>` the data is undefined, then it looks like you aren't using the code in the answer that calls it with `tokenSet.map(parseIntoJisho)` - that'll pass the argument along. Unless the Set literally contains `undefined` values, of course. After adding the tokens to the Set, replace all the code inside `tokenize(text).then` with the code in my answer – CertainPerformance Mar 18 '21 at 15:57
  • Thanks. I will try it asap and reply back. One small thing is that tokenSet is a set, so I should parse them into array first before using .map right? Cause in your answer I see you use .map with a tokenSet, which is a set and have no .map method as far as I know. – aanhlle Mar 18 '21 at 16:15
  • Ah, right, definitely, turn it into an array first. – CertainPerformance Mar 18 '21 at 16:16
  • hi, hope that you are still there xD I've tried to do exactly as you said but an error showed up: "Cannot access 'parseIntoJisho' before initialization. I've tried some ways like put the `const parseIntoJisho` above the Promise.all() but got another error (data is undefined). My code is here: https:// paste.ofcode.org/Vg7RbGX5ZGNga3VmjHf5RS – aanhlle Mar 19 '21 at 03:21
  • Variables declared with `const` or `let` can't be referenced before the line that initializes them runs. I'd move the `parseIntoJisho` out of the function onto an outer level. Or you can declare `parseIntoJisho` before the `Promise.all` part – CertainPerformance Mar 19 '21 at 03:23
  • Hi, I've tried to move the `const parseIntoJisho = data =>` out of the function, even out of the bracket to initialize it but always get the error property `jlpt` is undefined. I guess it's because when we do `const parseIntoJisho = data =>` the initial data that is used to pass into fetch is always the value of `const parseIntoJisho ` which is undefined. Could you please take another look at your code to verify its integrity? I really appreciate your help. – aanhlle Mar 19 '21 at 03:37
  • The `const parseIntoJisho` is the function, it should never be undefined. The `data` argument should always be defined too if the elements of the Set are all defined, `[...tokenSet].map(parseIntoJisho)` will pass each item of the Set as the first argument to the function. What exactly is the error message you're getting? Even if `.jlpt` were undefined, accessing it alone shouldn't throw an error – CertainPerformance Mar 19 '21 at 03:40
  • I got `TypeError: cannot read property 'jlpt' of undefined` then below is `UnhandledPromisingRejectionWarning: Referenced Error: data is not defined` my code: https:// paste.ofcode.org/38upuYnXsSek2AmrFmtbBPM – aanhlle Mar 19 '21 at 03:48
  • Use the `result` variable name instead, since that's the loop parameter – CertainPerformance Mar 19 '21 at 03:49
  • hi, thanks. The original variable `data` is used to log the value of the tokenSet. If using `result` then it will log the jlpt value. `.then((results)` will only pass results which contain the jlpt value, how can I tap into the original tokenized data to show it ? e.g `result + " is of JLPT level N3 or above."` === `jlpt-n5 is of level N3 or above`. I want `日本語 is of level N3 or above`. Is there a way to pass which data is mapped in map(parseIntoJisho) (so I could tap into that data instead of result)? – aanhlle Mar 19 '21 at 04:17
  • Looks like you'll need another `Promise.all` – CertainPerformance Mar 19 '21 at 04:22
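The "Cannot access 'parseIntoJisho' before initialization" error from the comments can be reproduced in isolation. A small sketch, with the made-up function names `callsHoisted` and `callsTooEarly`, contrasting a hoisted function declaration with a `const` arrow function:

```javascript
// A function declaration is hoisted together with its body,
// so it can be called above the line where it is written.
function callsHoisted() {
  return hoisted("が");
  function hoisted(word) { return "looked up " + word; }
}

// A const binding exists from the start of its scope but stays
// uninitialized until its declaration line runs. Calling it earlier throws.
function callsTooEarly() {
  try {
    return notYet("が");
  } catch (err) {
    return err.name; // the name of the error that was thrown
  }
  // Never reached here, but the declaration still creates the binding.
  const notYet = word => "looked up " + word;
}

console.log(callsHoisted());
console.log(callsTooEarly());
```

This is why either declaring `parseIntoJisho` above the `Promise.all` call or writing it as a function declaration avoids the error.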