Module ScanFile


§ScanFile

§File: Indexing/Scan/ScanFile.rs

§Role in Air Architecture

Provides individual file scanning functionality for the File Indexer service, handling reading, metadata extraction, and categorization of files for indexing.

§Primary Responsibility

Scan individual files to extract their metadata and content and to prepare them for indexing operations.

§Secondary Responsibilities

  • File access validation and permission checking
  • Encoding detection for text files
  • Language detection for code files
  • File size validation
  • Symbolic link detection
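Three of the responsibilities above (encoding detection, language detection, and symbolic link detection) can be sketched with the standard library alone. This is a minimal illustration, not the module's implementation: the function names `detect_bom`, `detect_language`, and `is_symlink` are hypothetical, and the byte-order-mark and file-extension heuristics are assumptions about one plausible approach.

```rust
use std::io;
use std::path::Path;

/// Hypothetical helper: guess an encoding from a leading byte-order mark.
/// Assumption: only UTF-8 and UTF-16 BOMs are recognized; a UTF-32LE BOM
/// also begins with FF FE and is not distinguished here.
fn detect_bom(sample: &[u8]) -> Option<&'static str> {
    if sample.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some("utf-8")
    } else if sample.starts_with(&[0xFF, 0xFE]) {
        Some("utf-16le")
    } else if sample.starts_with(&[0xFE, 0xFF]) {
        Some("utf-16be")
    } else {
        None
    }
}

/// Hypothetical helper: map a file extension to a language label.
/// The extension table is illustrative only.
fn detect_language(path: &Path) -> Option<&'static str> {
    match path.extension()?.to_str()? {
        "rs" => Some("rust"),
        "ts" | "tsx" => Some("typescript"),
        "py" => Some("python"),
        _ => None,
    }
}

/// Hypothetical helper: report whether the path itself is a symbolic link.
/// `symlink_metadata` inspects the link rather than following it.
fn is_symlink(path: &Path) -> io::Result<bool> {
    Ok(std::fs::symlink_metadata(path)?.file_type().is_symlink())
}
```

Using `symlink_metadata` rather than `metadata` matters here, because `metadata` follows links and would report on the target instead.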

§Dependencies

External Crates:

  • tokio - Async file I/O operations
  • sha2 - Checksum calculation for file integrity

Internal Modules:

  • crate::Result - Error handling type
  • crate::AirError - Error types
  • crate::Configuration::IndexingConfig - Indexing configuration
  • super::super::State::CreateState - State structure definitions
  • super::Process::ProcessContent - Content processing operations

§Dependents

  • Indexing::Scan::ScanDirectory - Batch file processing
  • Indexing::Watch::WatchFile - Individual file change handling
  • Indexing::mod::FileIndexer - Main file indexer implementation

§VSCode Pattern Reference

Inspired by VSCode’s file scanning in src/vs/workbench/services/files/

§Security Considerations

  • Path canonicalization before access
  • File size limits enforced
  • Timeout protection for I/O operations
  • Permission checking before reads
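The ordering of these checks can be sketched as follows. This is a simplified illustration under stated assumptions, not the module's code: `validate_for_scan` is a hypothetical name, the `root` confinement check is an assumed reason for canonicalizing, `std::io::Error` stands in for the crate's own error type, and timeout protection is omitted because it depends on the async runtime.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical validation step: canonicalize first, then check type,
/// size, and readability. `root` and `max_bytes` are illustrative inputs.
fn validate_for_scan(path: &Path, root: &Path, max_bytes: u64) -> io::Result<PathBuf> {
    // Resolve symlinks and `..` segments before making any decision.
    let canonical = fs::canonicalize(path)?;
    let canonical_root = fs::canonicalize(root)?;
    if !canonical.starts_with(&canonical_root) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path escapes the indexing root",
        ));
    }

    let metadata = fs::metadata(&canonical)?;
    if !metadata.is_file() {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "not a regular file"));
    }
    // Enforce the size limit before any content is read.
    if metadata.len() > max_bytes {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "file exceeds size limit"));
    }

    // Attempting to open the file is a direct way to confirm read permission.
    fs::File::open(&canonical)?;
    Ok(canonical)
}
```

Checking the size from metadata keeps oversized files from ever being loaded, and every later step works on the canonical path rather than the caller-supplied one.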

§Performance Considerations

  • Asynchronous file reading
  • Batch processing operations
  • Memory-efficient streaming for large files
  • Cached metadata when available
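Memory-efficient streaming can be illustrated with a fixed-size buffer that is reused for every chunk. The sketch below is an assumption-laden example rather than the module's code: `stream_stats` is a hypothetical name, and it uses blocking `std::io::Read` for brevity where the module itself relies on tokio for asynchronous I/O.

```rust
use std::io::{self, Read};

/// Hypothetical streaming pass: counts bytes and newline characters
/// without ever holding the whole input in memory.
fn stream_stats<R: Read>(mut reader: R) -> io::Result<(u64, u64)> {
    let mut buffer = [0u8; 8192]; // fixed buffer reused for each chunk
    let mut bytes = 0u64;
    let mut lines = 0u64;
    loop {
        let read = match reader.read(&mut buffer) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // A read interrupted by a signal is safe to retry.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        bytes += read as u64;
        lines += buffer[..read].iter().filter(|&&b| b == b'\n').count() as u64;
    }
    Ok((bytes, lines))
}
```

Because the function is generic over `Read`, it accepts a `std::fs::File` as well as an in-memory byte slice, and peak memory stays at the buffer size regardless of file size.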

§Error Handling Strategy

File scanning functions return a Result whose error carries a detailed message explaining why a file cannot be scanned or accessed. Errors are logged, and the failure of an individual file does not halt a batch operation.
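The batch behaviour described here, where one failing file is recorded and the rest still run, might look like the sketch below. It is illustrative only: `BatchReport` and `scan_batch` are hypothetical names, `String` stands in for the crate's `AirError`, and `eprintln!` stands in for whatever logging the crate actually uses.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical batch outcome: successful results plus per-file failures.
struct BatchReport<T> {
    scanned: Vec<T>,
    failures: Vec<(PathBuf, String)>,
}

/// Hypothetical batch driver: each failure is logged and recorded,
/// and the loop continues with the next file.
fn scan_batch<T, F>(paths: &[PathBuf], scan_one: F) -> BatchReport<T>
where
    F: Fn(&Path) -> Result<T, String>,
{
    let mut report = BatchReport {
        scanned: Vec::new(),
        failures: Vec::new(),
    };
    for path in paths {
        match scan_one(path) {
            Ok(item) => report.scanned.push(item),
            Err(reason) => {
                // Stand-in for real logging.
                eprintln!("skipping {}: {}", path.display(), reason);
                report.failures.push((path.clone(), reason));
            }
        }
    }
    report
}
```

Returning the failures alongside the successes lets the caller decide whether a partially scanned batch is acceptable.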

§Thread Safety

File scanning operations are designed for parallel execution.
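One way to run independent per-file scans in parallel is with scoped threads, sketched below. This is an assumption about how such parallelism could be arranged, not a description of the module: `scan_in_parallel` is a hypothetical name, and the module itself is built on tokio rather than on `std::thread`.

```rust
use std::thread;

/// Hypothetical parallel driver: splits the inputs into chunks and scans
/// each chunk on its own scoped thread. Result order matches input order.
fn scan_in_parallel<T, F>(inputs: &[String], workers: usize, scan_one: F) -> Vec<T>
where
    T: Send,
    F: Fn(&str) -> T + Sync,
{
    let workers = workers.max(1);
    // Ceiling division, clamped so `chunks` never receives zero.
    let chunk_size = ((inputs.len() + workers - 1) / workers).max(1);
    let scan_one = &scan_one;

    thread::scope(|scope| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = inputs
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|path| scan_one(path.as_str()))
                        .collect::<Vec<T>>()
                })
            })
            .collect();

        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("scan worker panicked"))
            .collect()
    })
}
```

The `Sync` bound on the scan function is what makes sharing it across workers sound: a per-file scan that keeps no shared mutable state satisfies it without locks.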

Functions§

CalculateChecksum
Calculate SHA-256 checksum for file content
FileModifiedSince
Check if file has been modified since last indexed
GetFileSize
Get file size with error handling
GetPermissionsString
Get file permissions as string
IndexFileInternal
Index a single file internally with comprehensive validation
IsBinaryFile
Check if file is binary (not suitable for indexing)
IsTextFile
Check if file is text-based (likely to be code or documentation)
ScanFileMetadata
Scan file and return just the metadata (without symbols)
ValidateFileAccess
Validate file access and permissions before scanning
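The signatures of these functions are not shown on this page, so the following is only a sketch of the kind of logic that functions such as IsBinaryFile, GetPermissionsString, and FileModifiedSince could contain. All names, parameters, and heuristics below are hypothetical: the NUL-byte test and the Unix-style `rwx` rendering are assumptions, not the module's documented behaviour.

```rust
use std::time::SystemTime;

/// Hypothetical heuristic: treat a sample containing a NUL byte as binary.
fn looks_binary(sample: &[u8]) -> bool {
    sample.contains(&0)
}

/// Hypothetical rendering of Unix permission bits as an `rwx` string.
fn permissions_string(mode: u32) -> String {
    const FLAGS: [(u32, char); 9] = [
        (0o400, 'r'), (0o200, 'w'), (0o100, 'x'), // owner
        (0o040, 'r'), (0o020, 'w'), (0o010, 'x'), // group
        (0o004, 'r'), (0o002, 'w'), (0o001, 'x'), // others
    ];
    FLAGS
        .iter()
        .map(|&(bit, ch)| if mode & bit != 0 { ch } else { '-' })
        .collect()
}

/// Hypothetical check: has the file changed after it was last indexed?
fn modified_since(modified: SystemTime, last_indexed: SystemTime) -> bool {
    modified > last_indexed
}
```

On Unix, the `mode` argument could come from `std::os::unix::fs::PermissionsExt::mode`, and the modification time from `std::fs::Metadata::modified`.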