RosettaCodeData/Task/K-d-tree/00DESCRIPTION

29 lines
2.0 KiB
Plaintext
Raw Permalink Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

{{wikipedia|K-d tree}}
[[Category:Data Structures]]
A k-d tree (short for ''k''-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. k-d trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches).
k-d trees are a special case of binary space partitioning trees.
k-d trees are not suitable, however, for efficiently finding the nearest neighbor in high dimensional spaces. As a general rule, if the dimensionality is ''k'', the number of points in the data, ''N'', should be ''N''2<sup>''k''</sup>.
Otherwise, when k-d trees are used with high-dimensional data, most of the points in the tree will be evaluated and the efficiency is no better than exhaustive search, and other methods such as approximate nearest-neighbor are used instead.
'''Task:''' Construct a k-d tree and perform a nearest neighbor search for two example data sets:
# The Wikipedia example data of [(2,3), (5,4), (9,6), (4,7), (8,1), (7,2)].
# 1000 3-d points uniformly distributed in a 3-d cube.
For the Wikipedia example, find the nearest neighbor to point (9, 2)
For the random data, pick a random location and find the nearest neighbor.
In addition, instrument your code to count the number of nodes visited in the nearest neighbor search. Count a node as visited if any field of it is accessed.
Output should show the point searched for, the point found,
the distance to the point, and the number of nodes visited.
There are variant algorithms for constructing the tree.
You can use a simple median strategy or implement something more efficient.
Variants of the nearest neighbor search include nearest N neighbors, approximate nearest neighbor, and range searches.
You do not have to implement these.
The requirement for this task is specifically the nearest single neighbor.
Also there are algorithms for inserting, deleting, and balancing k-d trees.
These are also not required for the task.