TRACEABILITY
WITH LSI
Traceability
The ability to link between different artifacts
Example artifacts: code, user manuals, design
documentation, development wikis, etc.
In particular, link code to:
Relevant requirements
Sections in design documents
Test-cases
Other structured and free-text artifacts
Also, link from requirements, design
documents, etc. to code
/**
* Creates a new File…
*/
public FileImpl(String nativePath
...){
…
}
/**
*…
*/
private String f(..){…}
}
[Figure: diagram with the labels Code, Path, Document, Scored Document]
Text Normalization
/**
* Creates a new File…
*/
public FileImpl(String nativePath
...){
…
}
/**
*…
*/
private String f(..){…}
}
Words extraction:
• Class name
• Public function names
• Public function arguments and return type
• Comments
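The extraction step listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Java source, the regular expressions, the stop-word list, and the helper names `split_identifier` and `extract_words` are all hypothetical and are not taken from the slides; a real tool would more likely use a proper Java parser.

```python
import re

# Hypothetical Java source, loosely modelled on the slide's FileImpl fragment.
JAVA_SOURCE = '''
/**
 * Creates a new File from a native path.
 */
public class FileImpl {
    public FileImpl(String nativePath) { }

    /** Internal helper. */
    private String f(int x) { return null; }

    public boolean isHidden(String userName) { return false; }
}
'''

COMMENT_RE = re.compile(r'/\*.*?\*/|//[^\n]*', re.DOTALL)
# public [static] [ReturnType] name(params); constructors have no return type.
PUBLIC_METHOD_RE = re.compile(
    r'\bpublic\s+(?:static\s+)?(?:([\w<>\[\]]+)\s+)?(\w+)\s*\(([^)]*)\)')
STOP_WORDS = {"a", "an", "the", "of", "from", "to"}  # illustrative only


def split_identifier(name):
    """Split camelCase / PascalCase identifiers into lowercase words."""
    parts = re.findall(r'[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+', name)
    return [p.lower() for p in parts]


def extract_words(source):
    """Collect words from comments, the class name, and public signatures."""
    raw = []
    # Comments (block and line comments).
    for comment in COMMENT_RE.findall(source):
        raw += re.findall(r'[A-Za-z]+', comment)
    # Strip comments so keywords inside them are not mistaken for code.
    code = COMMENT_RE.sub(' ', source)
    # Class name.
    class_match = re.search(r'\bclass\s+(\w+)', code)
    if class_match:
        raw.append(class_match.group(1))
    # Public function names, return types, and arguments.
    for return_type, name, params in PUBLIC_METHOD_RE.findall(code):
        raw.append(name)
        if return_type:
            raw.append(return_type)
        raw += re.findall(r'[A-Za-z_]\w*', params)
    # Normalization: split identifiers, lowercase, drop stop words.
    words = []
    for token in raw:
        words += [w for w in split_identifier(token) if w not in STOP_WORDS]
    return words


words = extract_words(JAVA_SOURCE)
print(words)
```

In this sketch the private method `f` contributes nothing from its signature, while its comment is still indexed, which matches the list above (comments are taken as a whole, signatures only when public).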
A Comparison of Traceability Techniques for Specifications
Words Expansion
Vector Space Model
■ Classic IR might lead to poor retrieval because:
◆ unrelated documents might be included in the
answer set
◆ relevant documents that do not contain at least
one index term are not retrieved
◆ Reasoning: retrieval based on index terms is
vague and noisy
Example
from Lillian Lee
Synonymy: documents will have a small cosine but are related
Polysemy: documents will have a large cosine but are not truly related
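Both failure modes of the plain vector space model can be reproduced with raw term-count vectors and cosine similarity. The following is a minimal sketch, assuming whitespace tokenization and unweighted term counts; the four example texts are made up for illustration and are not the ones from Lillian Lee's slide.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words vector: term -> raw count (no weighting, no stemming)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Synonymy: related texts that share no index term.
syn_a = term_vector("car engine repair")
syn_b = term_vector("automobile motor maintenance")

# Polysemy: unrelated senses of "jaguar" that share surface terms.
poly_a = term_vector("jaguar speed record")   # the animal
poly_b = term_vector("jaguar speed test")     # the car

syn_sim = cosine(syn_a, syn_b)
poly_sim = cosine(poly_a, poly_b)
print(f"synonymy pair: {syn_sim:.3f}")
print(f"polysemy pair: {poly_sim:.3f}")
```

The synonymy pair scores 0 because no term overlaps, so a query using one vocabulary never retrieves the other document; the polysemy pair scores 2/3 purely on shared surface terms.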
Latent Semantic Indexing
■ The user information need is more related to concepts
and ideas than to index terms
■ A document that shares concepts with another
document known to be relevant might be of interest
http://lsi.research.telcordia.com/
m1: The generation of random, binary, ordered trees
m2: The intersection graph of paths in trees
m3: Graph minors IV: Widths of trees and well-quasi-ordering
m4: Graph minors: A survey
Computing an Example
M = [matrix figure not recovered]
S = [matrix figure not recovered]
D = [matrix figure not recovered]
New M = [matrix figure not recovered]
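Since the matrices themselves are not available here, the computation can be illustrated on a small stand-in corpus. The sketch below assumes the usual LSI reading of the symbols, namely that M is the term-by-document matrix, that it is factored as M = T S Dᵀ by a singular value decomposition, and that "New M" is the rank-k reconstruction. The five toy documents, the choice k = 2, and all variable names are assumptions made for illustration, not the slide's data.

```python
import numpy as np

# Hypothetical toy corpus (not the slide's data).
# Rows are terms, columns are documents:
#   d1: car            d2: automobile     d3: car automobile engine
#   d4: flower petal   d5: flower
M = np.array([
    [1, 0, 1, 0, 0],  # car
    [0, 1, 1, 0, 0],  # automobile
    [0, 0, 1, 0, 0],  # engine
    [0, 0, 0, 1, 1],  # flower
    [0, 0, 0, 1, 0],  # petal
], dtype=float)


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    return float(u @ v / (nu * nv)) if nu > 0 and nv > 0 else 0.0


# Singular value decomposition: M = T @ diag(s) @ Dt
T, s, Dt = np.linalg.svd(M, full_matrices=False)

k = 2  # number of latent concepts kept (an assumption for this toy corpus)

# Rank-k approximation of M, i.e. the "New M" of this sketch.
new_M = T[:, :k] @ np.diag(s[:k]) @ Dt[:k, :]

# Document coordinates in the k-dimensional concept space, one row per document.
doc_coords = (np.diag(s[:k]) @ Dt[:k, :]).T

raw_sim = cosine(M[:, 0], M[:, 1])                 # d1 vs d2 on raw counts
lsi_sim = cosine(doc_coords[0], doc_coords[1])     # d1 vs d2 in concept space
cross_sim = cosine(doc_coords[0], doc_coords[3])   # d1 vs d4 (other topic)

print(f"raw cosine(d1, d2) = {raw_sim:.3f}")
print(f"LSI cosine(d1, d2) = {lsi_sim:.3f}")
print(f"LSI cosine(d1, d4) = {cross_sim:.3f}")
print(np.round(new_M, 2))
```

In this toy setting d1 and d2 share no term, so their raw cosine is 0, yet both co-occur with d3 and end up on the same concept axis after truncation. The reconstructed matrix also gains a positive entry for "car" in d2, which is the kind of synonymy repair the vector space model cannot provide on its own.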
Recall @ n:
Average precision:
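The formulas for these two measures are not reproduced here, so the sketch below falls back on the standard information-retrieval definitions, which is an assumption: recall at n is the fraction of all relevant documents found in the top n results, and average precision sums the precision at each rank where a relevant document appears and divides by the total number of relevant documents. The file names are hypothetical, and the ranked list is assumed to contain no duplicates.

```python
def recall_at_n(ranked, relevant, n):
    """Fraction of the relevant documents that appear in the top n results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:n] if doc in relevant)
    return hits / len(relevant)


def average_precision(ranked, relevant):
    """Sum of precision@k over ranks k holding a relevant document,
    divided by the total number of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant)


# Hypothetical ranking returned for one requirement, best match first.
ranked = ["A.java", "B.java", "C.java", "D.java", "E.java"]
# Hypothetical ground-truth links; Z.java is relevant but never retrieved.
relevant = {"B.java", "D.java", "Z.java"}

r3 = recall_at_n(ranked, relevant, 3)
r5 = recall_at_n(ranked, relevant, 5)
ap = average_precision(ranked, relevant)
print(r3, r5, ap)
```

Dividing by the total number of relevant documents, rather than by the number retrieved, means a missed link such as `Z.java` lowers the average precision even though it never appears in the ranking.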