Class Lucene91HnswGraphBuilder
- java.lang.Object
-
- org.apache.lucene.backward_codecs.lucene91.Lucene91HnswGraphBuilder
-
public final class Lucene91HnswGraphBuilder extends java.lang.Object
Builder for HNSW graph. SeeHnswGraph
for a gloss on the algorithm and the meaning of the hyperparameters.
-
-
Field Summary
Fields Modifier and Type Field Description private int
beamWidth
private BoundsChecker
bound
private RandomAccessVectorValues
buildVectors
private static long
DEFAULT_RAND_SEED
Default random seed for level generation *private HnswGraphSearcher
graphSearcher
(package private) Lucene91OnHeapHnswGraph
hnsw
static java.lang.String
HNSW_COMPONENT
A name for the HNSW component for the info-stream *private InfoStream
infoStream
private int
maxConn
private double
ml
private java.util.SplittableRandom
random
static long
randSeed
Random seed for level generation; public to expose for testing *private Lucene91NeighborArray
scratch
private VectorSimilarityFunction
similarityFunction
private RandomAccessVectorValues
vectorValues
-
Constructor Summary
Constructors Constructor Description Lucene91HnswGraphBuilder(RandomAccessVectorValuesProducer vectors, VectorSimilarityFunction similarityFunction, int maxConn, int beamWidth, long seed)
Reads all the vectors from a VectorValues, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description private void
addDiverseNeighbors(int level, int node, NeighborQueue candidates)
(package private) void
addGraphNode(int node, float[] value)
Inserts a doc with vector value to the graphLucene91OnHeapHnswGraph
build(RandomAccessVectorValues vectors)
Reads all the vectors from two copies of a random access VectorValues.private boolean
diversityCheck(float[] candidate, float score, Lucene91NeighborArray neighbors, RandomAccessVectorValues vectorValues)
private void
diversityUpdate(Lucene91NeighborArray neighbors)
private int
findNonDiverse(Lucene91NeighborArray neighbors)
private static int
getRandomGraphLevel(double ml, java.util.SplittableRandom random)
private void
popToScratch(NeighborQueue candidates)
private long
printGraphBuildStatus(int node, long start, long t)
private void
selectDiverse(Lucene91NeighborArray neighbors, Lucene91NeighborArray candidates)
void
setInfoStream(InfoStream infoStream)
Set info-stream to output debugging information *
-
-
-
Field Detail
-
DEFAULT_RAND_SEED
private static final long DEFAULT_RAND_SEED
Default random seed for level generation *- See Also:
- Constant Field Values
-
HNSW_COMPONENT
public static final java.lang.String HNSW_COMPONENT
A name for the HNSW component for the info-stream *- See Also:
- Constant Field Values
-
randSeed
public static long randSeed
Random seed for level generation; public to expose for testing *
-
maxConn
private final int maxConn
-
beamWidth
private final int beamWidth
-
ml
private final double ml
-
scratch
private final Lucene91NeighborArray scratch
-
similarityFunction
private final VectorSimilarityFunction similarityFunction
-
vectorValues
private final RandomAccessVectorValues vectorValues
-
random
private final java.util.SplittableRandom random
-
bound
private final BoundsChecker bound
-
graphSearcher
private final HnswGraphSearcher graphSearcher
-
hnsw
final Lucene91OnHeapHnswGraph hnsw
-
infoStream
private InfoStream infoStream
-
buildVectors
private RandomAccessVectorValues buildVectors
-
-
Constructor Detail
-
Lucene91HnswGraphBuilder
public Lucene91HnswGraphBuilder(RandomAccessVectorValuesProducer vectors, VectorSimilarityFunction similarityFunction, int maxConn, int beamWidth, long seed) throws java.io.IOException
Reads all the vectors from a VectorValues, builds a graph connecting them by their dense ordinals, using the given hyperparameter settings, and returns the resulting graph.- Parameters:
vectors
- the vectors whose relations are represented by the graph - must provide a different view over those vectors than the one used to add via addGraphNode.maxConn
- the number of connections to make when adding a new graph node; roughly speaking the graph fanout.beamWidth
- the size of the beam search to use when finding nearest neighbors.seed
- the seed for a random number generator used during graph construction. Provide this to ensure repeatable construction.- Throws:
java.io.IOException
-
-
Method Detail
-
build
public Lucene91OnHeapHnswGraph build(RandomAccessVectorValues vectors) throws java.io.IOException
Reads all the vectors from two copies of a random access VectorValues. Providing two copies enables efficient retrieval without extra data copying, while avoiding collision of the returned values.- Parameters:
vectors
- the vectors for which to build a nearest neighbors graph. Must be an independet accessor for the vectors- Throws:
java.io.IOException
-
setInfoStream
public void setInfoStream(InfoStream infoStream)
Set info-stream to output debugging information *
-
addGraphNode
void addGraphNode(int node, float[] value) throws java.io.IOException
Inserts a doc with vector value to the graph- Throws:
java.io.IOException
-
printGraphBuildStatus
private long printGraphBuildStatus(int node, long start, long t)
-
addDiverseNeighbors
private void addDiverseNeighbors(int level, int node, NeighborQueue candidates) throws java.io.IOException
- Throws:
java.io.IOException
-
selectDiverse
private void selectDiverse(Lucene91NeighborArray neighbors, Lucene91NeighborArray candidates) throws java.io.IOException
- Throws:
java.io.IOException
-
popToScratch
private void popToScratch(NeighborQueue candidates)
-
diversityCheck
private boolean diversityCheck(float[] candidate, float score, Lucene91NeighborArray neighbors, RandomAccessVectorValues vectorValues) throws java.io.IOException
- Parameters:
candidate
- the vector of a new candidate neighbor of a node nscore
- the score of the new candidate and node n, to be compared with scores of the candidate and n's neighborsneighbors
- the neighbors selected so farvectorValues
- source of values used for making comparisons between candidate and existing neighbors- Returns:
- whether the candidate is diverse given the existing neighbors
- Throws:
java.io.IOException
-
diversityUpdate
private void diversityUpdate(Lucene91NeighborArray neighbors) throws java.io.IOException
- Throws:
java.io.IOException
-
findNonDiverse
private int findNonDiverse(Lucene91NeighborArray neighbors) throws java.io.IOException
- Throws:
java.io.IOException
-
getRandomGraphLevel
private static int getRandomGraphLevel(double ml, java.util.SplittableRandom random)
-
-