Projections
Projections are specialized copies of data optimized for specific search tasks. Instead of searching directly in the main database, Luna-Vinder creates projections that contain only the data necessary for particular use cases. This makes searches significantly faster and more efficient while reducing the load on the primary data storage.
Projection Concept
A projection is essentially a filtered and optimized view of your event data. When you create a projection, you define four key parameters: filters, which select the events to include; targets, which select the attributes of those events to save; origin, which specifies where the source data comes from; and type, which determines how the projection is organized. This approach lets you create multiple projections, each tailored to a specific search scenario and containing only relevant information.
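As a rough illustration, the four parameters can be pictured as one small record. The sketch below is not the Luna-Vinder API schema: the `ProjectionSpec` class and the attribute names used inside `filters` and `targets` are hypothetical. Only the values `events` and `view` and the existence of a description field come from this page.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class ProjectionSpec:
    """Hypothetical container for the parameters that define a projection."""

    origin: str                                             # where the source data comes from
    type: str                                               # how the projection is organized
    filters: dict[str, Any] = field(default_factory=dict)   # which events to include
    targets: tuple[str, ...] = ()                           # which attributes to save
    description: str = ""                                   # human-readable purpose


entrance_events = ProjectionSpec(
    origin="events",
    type="view",
    filters={"source": "entrance-camera"},   # hypothetical filter field and value
    targets=("event_id", "create_time"),     # hypothetical attribute names
    description="Events from the entrance camera, used for access-control search",
)
```

Keeping each scenario in its own small record like this mirrors the idea of one projection per search scenario.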
The main advantage of projections is that they contain only pre-filtered data with a selected set of attributes. This sharply reduces the volume of data that must be processed during search. For example, if you need to search only among events from a specific source over the last month, the projection will contain only those events, not the millions or billions of records in the primary storage. It’s important to understand that projections are working copies of data, while the source of truth remains the primary data storage. Projections can be recreated, deleted, and modified without risk of losing original data.
Projection Components
Origin (Data Source)
Origin determines the source from which Projector extracts data for the projection. This can be any data storage integrated with Luna-Vinder. Currently, the events source is supported, which provides access to system events with their descriptors and metadata. The choice of origin determines not only the data source but also the attributes available for filtering and selection, because different sources have different data structures.
Filters
Filters define which objects from the specified source are included in the projection. This is one of the key optimization mechanisms: well-chosen filters keep only the data you actually need in the projection, significantly reducing its size.
Filters can be combined to create complex selection conditions. All filter conditions are joined by logical AND: an object is included in the projection only if it satisfies every specified condition. If no filters are specified (empty object {}), all events from the source are included in the projection.
Note
The specific fields and operators available for filtering depend on the data source. For a complete list of available filter fields and operators, refer to the API documentation.
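The AND semantics described above, including the empty-filter case, can be sketched in a few lines. This is a simplified model rather than Luna-Vinder code: the `matches` helper is hypothetical, the field names are invented, and only equality conditions are modeled, whereas the real operators depend on the data source.

```python
from typing import Any, Mapping


def matches(event: Mapping[str, Any], filters: Mapping[str, Any]) -> bool:
    """Return True when the event satisfies every filter condition (logical AND)."""
    # all() over an empty mapping is True, so {} matches every event.
    return all(event.get(name) == expected for name, expected in filters.items())


events = [
    {"source": "cam-1", "handler": "h-a"},
    {"source": "cam-1", "handler": "h-b"},
    {"source": "cam-2", "handler": "h-a"},
]

# Both conditions must hold, so only the first event is selected.
selected = [e for e in events if matches(e, {"source": "cam-1", "handler": "h-a"})]

# An empty filter object selects everything.
everything = [e for e in events if matches(e, {})]
```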
Targets (Target Attributes)
Targets define which event attributes are saved in the projection along with descriptors. This parameter is critical for optimization: the fewer attributes you include in targets, the more compact the projection and the faster the search. At the same time, you must include every attribute that you plan to use during search or return in results.
Important
Any attribute you want to use in index composite fields or return in search results must be included in targets. If an attribute is absent from targets, it won’t be available for filtering during search and won’t appear in results, even if it exists in the source data.
The specific attributes available for targets depend on your data source. Common categories include event identifiers, temporal information, source metadata, object attributes, geolocation data, and custom fields. Choose only the attributes that are necessary for your specific search scenario.
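The effect of targets can be pictured as a simple attribute selection. The sketch below is an assumption-laden illustration: `project` is a hypothetical helper and the attribute names are made up. It shows why an attribute left out of targets cannot be filtered on or returned later: it is simply not present in the projected record.

```python
from typing import Any, Iterable, Mapping


def project(event: Mapping[str, Any], targets: Iterable[str]) -> dict[str, Any]:
    """Keep only the attributes named in targets; everything else is dropped."""
    return {name: event[name] for name in targets if name in event}


source_event = {
    "event_id": "e-1",
    "create_time": "2024-05-01T10:00:00Z",
    "city": "Berlin",  # exists in the source data but is not listed in targets
}

stored = project(source_event, ["event_id", "create_time"])
```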
Projection Type
The projection type determines how data is stored and processed in the system. Currently, the view type is supported, which represents a virtual view of the data. When Luna-Vinder components access a view-type projection, data is extracted from the source according to the specified filters and targets. This ensures data freshness and flexibility when working with different data slices.
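The behavior described for the view type (nothing stored up front, data read from the source at access time) can be modeled with a toy class. This models only the behavior stated on this page, not how Luna-Vinder implements it; the `View` class, the in-memory list used as a source, and the field names are all hypothetical.

```python
from typing import Any, Callable, Iterator


class View:
    """Toy view: stores no rows and re-reads the source on every access."""

    def __init__(
        self,
        source: list[dict[str, Any]],
        predicate: Callable[[dict[str, Any]], bool],
        targets: list[str],
    ) -> None:
        self._source = source        # reference to the source, not a copy
        self._predicate = predicate  # stands in for the projection filters
        self._targets = targets      # attributes to expose

    def rows(self) -> Iterator[dict[str, Any]]:
        for event in self._source:
            if self._predicate(event):
                yield {name: event[name] for name in self._targets if name in event}


source = [{"event_id": "e-1", "source": "cam-1"}]
view = View(source, lambda e: e["source"] == "cam-1", ["event_id"])

before = list(view.rows())
source.append({"event_id": "e-2", "source": "cam-1"})  # new event arrives in the source
after = list(view.rows())                               # the view reflects it on next access
```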
Best Practices
When creating projections, follow several important principles to achieve optimal performance and system manageability.
Use the most specific filters possible. Don’t include data in the projection that you won’t need during search. Every unnecessary event in the projection increases memory usage in Matcher and slows down search. If you know you’ll only search among events from a specific handler or for a certain time period, specify this in the projection filters. A well-filtered projection contains much less data than a projection without filters, and searches on it are significantly faster.
Include only necessary attributes in targets. Each attribute occupies memory space, and the more attributes, the more data needs to be transferred and processed. Analyze which attributes are truly needed for your search scenario and include only those. If you only need to filter by a few specific fields, there’s no point in including numerous other attributes that won’t be used.
Create separate projections for different use cases instead of one large universal projection. For example, it’s better to have a separate projection for searching recent events, a separate one for events from specific sources, and a separate one for analyzing a specific subset of data, rather than one huge projection trying to cover all cases. This simplifies management, improves performance, and makes the system more understandable.
Plan for projection growth. Consider how the projection size will increase over time. If a projection includes all events without time constraints, it will grow indefinitely. Use time-based filters to limit historical data if it’s not needed for search. For example, for real-time search tasks, a projection covering only recent data may be sufficient, which will significantly reduce data volume.
Use clear descriptions in the description field. When the system has dozens of projections, good descriptions help the team understand the purpose of each projection, avoid duplication, and make correct decisions when configuring search. The description should clearly reflect what data the projection contains and what tasks it’s intended for.
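The time-based filter recommended for limiting projection growth might be built along these lines. This is a sketch under stated assumptions: the filter key `create_time__gte` is a hypothetical name, and this page does not say whether the API accepts relative time ranges, so the cutoff is computed by the caller. Check the API documentation for the actual field and operator names.

```python
from datetime import datetime, timedelta, timezone


def recent_window_filter(days: int, now: datetime) -> dict[str, str]:
    """Build a filter that keeps only events newer than `days` before `now`."""
    cutoff = now - timedelta(days=days)
    # "create_time__gte" is a hypothetical filter key, not a documented one.
    return {"create_time__gte": cutoff.isoformat()}


filters = recent_window_filter(30, now=datetime(2024, 5, 31, tzinfo=timezone.utc))
# filters == {"create_time__gte": "2024-05-01T00:00:00+00:00"}
```

Passing `now` explicitly keeps the helper deterministic and easy to test.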