Questions tagged [lsa]

LSA stands for Latent Semantic Analysis, a natural language processing technique that analyses the relationships between documents and the terms they contain by producing a set of related concepts.

For the Microsoft Windows subsystem, see (local-security-authority).

113 questions
0
votes
0 answers

Performing SVD Feature Decomposition on a Large Sparse Matrix

I saved my features from text data with pickle in sparse matrix format with a shape of (323549, 4119259). I am trying to perform Singular Value Decomposition on them using the sklearn library; however, I keep getting a memory error which suggests…
zzenonn
  • 56
  • 6
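
A minimal sketch related to the sparse-SVD question above, not a confirmed fix for the memory error it describes. It assumes scikit-learn's `TruncatedSVD`, which does not center the data and therefore accepts a SciPy sparse matrix without converting it to a dense array. The matrix sizes and `n_components=20` are hypothetical and far smaller than the (323549, 4119259) matrix in the question.

```python
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Toy stand-in for the real feature matrix (hypothetical sizes).
X = sparse.random(1000, 5000, density=0.001, format="csr", random_state=0)

# The sparse input is passed as-is; only the k-dimensional results are dense.
svd = TruncatedSVD(n_components=20, algorithm="randomized", random_state=0)
X_reduced = svd.fit_transform(X)

print(X_reduced.shape)        # rows x n_components
print(svd.components_.shape)  # n_components x columns
```

Note that `components_` is a dense array with one row per component and one column per feature, so with millions of columns its size grows quickly as `n_components` increases.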
0
votes
1 answer

Latent Semantic Analysis and Stemming

Assume a very large corpus of any inflective language. Does the following make sense? By applying LSA on such a corpus, words with similar concepts converge together in vector space, thus inflected word forms referring to the same concept should…
L D
  • 190
  • 1
  • 1
  • 13
0
votes
1 answer

Transforming words into Latent Semantic Analysis (LSA) Vectors

Does anyone have any suggestions for how to turn words from a document into LSA vectors using Python and scikit-learn? I found sites here and here that describe how to turn a whole document into an LSA vector, but I am interested in converting…
Sigmund Reed
  • 77
  • 1
  • 8
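
One possible approach to the word-vector question above, sketched under assumptions: the toy `docs` list and the helper `word_vector` are hypothetical, and scaling the term directions by the singular values is one common convention rather than the only one. With a documents-by-terms matrix, `TruncatedSVD.components_` has one column per term, so its transpose gives one row per word.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
    "stocks fell as markets closed",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(docs)  # documents x terms (sparse)

svd = TruncatedSVD(n_components=2, random_state=0)
svd.fit(X)

# components_ is (n_components, n_terms); transpose to get one row per term,
# then scale each latent dimension by its singular value.
term_vectors = svd.components_.T * svd.singular_values_

def word_vector(word):
    """Look up the LSA vector of a word seen during fitting (hypothetical helper)."""
    return term_vectors[vectorizer.vocabulary_[word]]

print(word_vector("cat"))
```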
0
votes
1 answer

Different approaches for document similarity (LDA, LSA, cosine)

I have a set of short documents (1 or 2 paragraphs each). I have used three different approaches for document similarity: - simple cosine similarity on tfidf matrix - applying LDA on the whole corpus and then using the LDA model to create the vector for…
Eli
  • 93
  • 12
0
votes
1 answer

HashingTF not giving unique indices

I am implementing Latent Semantic Analysis (LSA) using Eclipse Mars, Java 8, and Spark (spark-assembly-1.6.1-hadoop2.4.0.jar). I passed the documents as tokens, then got the SVD and so on. HashingTF hf = new HashingTF(hashingTFSize); …
Yas
  • 21
  • 5
0
votes
1 answer

How to get similarity from LSA

I am working on latent semantic analysis and am trying to get the similarity between 2 documents. I run my latent semantic analysis code in Python, and when I run it I get: Here are the singular values [ 0.7376057 0.4596623 0.25422212] Here are the…
YayaYaya
  • 85
  • 3
  • 10
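
A rough sketch of one common way to get document similarity from LSA, related to the question above: project each document into the top-k latent space and compare documents with cosine similarity. The matrix `A`, the choice `k = 2`, and the helper `cosine` are hypothetical; they are not taken from the question's code or output.

```python
import numpy as np

# Hypothetical term-document count matrix: rows are terms, columns are documents.
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 2],
    [1, 0, 0, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2  # number of latent dimensions kept
# Columns of diag(s_k) @ Vt_k are the documents in latent space; transpose for one row each.
doc_vectors = (np.diag(s[:k]) @ Vt[:k, :]).T

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(doc_vectors[0], doc_vectors[1]))  # documents sharing terms
print(cosine(doc_vectors[0], doc_vectors[2]))  # documents with no shared terms
```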
0
votes
1 answer

Scikit-learn TruncatedSVD documentation

I plan to use sklearn.decomposition.TruncatedSVD to perform LSA for a Kaggle competition. I know the math behind SVD and LSA, but I'm confused by scikit-learn's user guide, so I'm not sure how to actually apply TruncatedSVD. In the doc, it states…
howard
  • 247
  • 2
  • 11
0
votes
0 answers

How to use LSA for dimension reduction in text analytics with R

I am a beginner at data science, and I am working on a text analytics/sentiment analysis project with tweets. What I have been trying to do is to perform some dimension reduction on my tweets training set, and feed the training set into a NaiveBayes…
0
votes
1 answer

SVD in LSI in the book Introduction to Information Retrieval

In Example 18.4 of the book Introduction to Information Retrieval, the term-document matrix is decomposed using SVD. My question is why Σ is a 5*5 matrix in the example. Shouldn't it be a 5*6 matrix? Is it wrong? Here is the link of the Chapter…
clement116
  • 177
  • 1
  • 9
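
A small sketch that may help with the Σ-shape question above. It assumes the book's example uses the reduced (economy) form of the SVD, which the excerpt does not confirm. For a 5 x 6 matrix, the full SVD has a 5 x 6 Σ, while the reduced form drops the all-zero column and keeps a 5 x 5 Σ; both reconstruct the same matrix. The matrix `C` here is random, hypothetical data, not the book's.

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.integers(0, 2, size=(5, 6)).astype(float)  # hypothetical 5 terms x 6 documents

# Full SVD: U is 5x5, Sigma is 5x6 (padded with a zero column), Vt is 6x6.
U_full, s, Vt_full = np.linalg.svd(C, full_matrices=True)
Sigma_full = np.zeros((5, 6))
Sigma_full[:5, :5] = np.diag(s)

# Reduced SVD: U is 5x5, Sigma is 5x5, Vt is 5x6.
U_red, s_red, Vt_red = np.linalg.svd(C, full_matrices=False)
Sigma_red = np.diag(s_red)

print(U_full.shape, Sigma_full.shape, Vt_full.shape)  # (5, 5) (5, 6) (6, 6)
print(U_red.shape, Sigma_red.shape, Vt_red.shape)     # (5, 5) (5, 5) (5, 6)

# Both factorizations reproduce C.
print(np.allclose(U_full @ Sigma_full @ Vt_full, C))  # True
print(np.allclose(U_red @ Sigma_red @ Vt_red, C))     # True
```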
0
votes
1 answer

Encoding issue R LSA

Doesn't lsa in R support foreign languages? My code: library("lsa") Loading required package: SnowballC trm = textmatrix("s/") The error: [lsa] - could not open file s/s.txt due to encoding problems of the file. Or am I doing something wrong? The file…
The6thSense
  • 7,073
  • 6
  • 28
  • 62
0
votes
1 answer

LSA Similarity interface

I am a PhD student in translation studies and I am currently working on my dissertation. I am using the LSA Similarity interface as a method of analysis in my dissertation. My background is in linguistics and not computer science. I tried to find an…
0
votes
1 answer

Probabilistic Latent Semantic Analysis

I am looking for any tutorial or implementation of PLSA in Java. There is a similar question at this link https://stackoverflow.com/questions/16396463/probabilistic-latent-semantic-analysis-indexing-in-java , however, there is no reply to this…
Abhishek Dewan
  • 129
  • 1
  • 2
  • 5
0
votes
1 answer

InstallShield calling advapi32.dll method type mismatch error

I am trying to call Advapi32.LsaOpenPolicy() from basic MSI InstallShield code. I've successfully called other advapi32.dll methods, but LsaOpenPolicy is throwing a mismatched type error. My prototype is: prototype INT…
0
votes
1 answer

Is there a memory implementation of the SparseVectorsFromSequenceFiles, RowIdJob and RowSimilarityJob jobs

I've been working on performing Latent Semantic Analysis using the SparseVectorsFromSequenceFiles, RowIdJob and RowSimilarityJob Hadoop jobs provided by Mahout, which run Map/Reduce jobs. I've been trying to find an equivalent implementation for…
0
votes
1 answer

How to calculate LSA Word score as seen in "LSA Intro AI Seminar"

If you check http://www.cs.nmsu.edu/~mmartin/LSA_Intro_AI_Seminar.ppt they show the calculated score for each word on Slide 25. I have not been able to find how to calculate this summary. Recently, I have completed an LSA implementation and can…
NKCSS
  • 2,641
  • 1
  • 19
  • 37