Tree Sitter Upgrade #5
@@ -23,13 +23,17 @@ The project uses a multi-layered approach to understand the Skill language:
 ### Key Components

 - **`skillls/main.py`**: The entry point of the LSP server. It implements the `LanguageServer` class and contains the handlers for LSP lifecycle events (`initialize`, `didOpen`, `didChange`, etc.) and feature requests (`inlayHint`, `documentSymbol`).
-- **`skillls/checker.py`**: Contains the logic for syntactic validation, specifically the algorithm for detecting unbalanced parentheses.
-- **`skillls/helpers.py`**: Provides the heavy lifting for text processing, including the content cleaning state machine and the recursive logic for building the node hierarchy.
+- **`skillls/parser.py`**: The new Tree-sitter based parser for syntax tree traversal and symbol extraction.
 - **`skillls/types.py`**: Defines the internal data models (e.g., `Node`, `URI`) used across the project.

+## Roadmap & Engineering Planning
+
+For details on identified technical debt, fragilities, and the long-term architectural hardening strategy, refer to [PLAN.md](./PLAN.md).
+
 ## Technical Stack

 - **Language**: Python 3.11+
+- **Package Management**: `uv`
 - **LSP Framework**: `pygls` (Python Language Server)
-- **Parsing Utilities**: `parsimonious` (PEG parser), `tree-sitter` (for structural tree analysis).
-- **Formatting & Tooling**: `rich` (terminal output), `black`, `ruff`, `mypy`.
+- **Parsing Utilities**: `tree-sitter` (for structural tree analysis).
+- **Formatting & Tooling**: `rich` (terminal output), `ruff`, `mypy`, `pytest`.
@@ -0,0 +1,31 @@
+# Project Hardening Plan
+
+This document outlines the identified fragilities in the `skillls` project and the planned architectural improvements to transform it from a functional prototype into a robust, production-ready Language Server.
+
+## 1. Grammar-Logic Decoupling
+**Problem**: The `SkillParser` relies on hardcoded string literals (e.g., `"function_definition"`) to identify symbols. Changes in the underlying `tree-sitter-skill` grammar will cause silent failures in the Outline view.
+**Goal**: Create a stable contract between the grammar and the parser.
+**Proposed Actions**:
+- [x] Implement a shared constants module or configuration file that defines significant node types.
+- [ ] (Long-term) Explore using Tree-sitter Queries (`Query` API) to match patterns instead of manual type checking, making the parser less dependent on specific node names and more focused on structural patterns.
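A possible way to make the "stable contract" checkable is a small test that compares the shared constants against the `node-types.json` file that Tree-sitter generates for a grammar. The sketch below assumes such a file is available for `tree-sitter-skill`; the helper names and the sample JSON excerpt are hypothetical and only illustrate the idea.

```python
import json

# Mirrors SYMBOLIC_NODE_TYPES from the constants module added in this change.
SYMBOLIC_NODE_TYPES = {
    "function_definition",
    "procedure_definition",
    "namespace",
    "let_binding",
}


def named_node_types(node_types_json: str) -> set[str]:
    """Collect the named node types declared in a node-types.json document."""
    entries = json.loads(node_types_json)
    return {entry["type"] for entry in entries if entry.get("named")}


def unknown_to_grammar(expected: set[str], node_types_json: str) -> set[str]:
    """Return the expected node types that the grammar does not declare."""
    return expected - named_node_types(node_types_json)


# Hypothetical excerpt; a real check would read the grammar's generated file.
SAMPLE_NODE_TYPES = json.dumps([
    {"type": "function_definition", "named": True},
    {"type": "procedure_definition", "named": True},
    {"type": "let_binding", "named": True},
    {"type": "(", "named": False},
])

# Any name reported here would silently drop out of the Outline view.
print(sorted(unknown_to_grammar(SYMBOLIC_NODE_TYPES, SAMPLE_NODE_TYPES)))
# prints ['namespace']
```

Run as part of the test suite, a check like this would turn a renamed grammar node into a failing test rather than a silently empty outline.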
+
+## 2. Iterative AST Traversal
+**Problem**: The current recursive traversal in `_traverse_tree` is susceptible to `RecursionError` on deeply nested files.
+**Goal**: Ensure the server can handle arbitrarily deep syntax trees without crashing.
+**Proposed Actions**:
+- [ ] Refactor `SkillParser._traverse_tree` to use an iterative approach (using a stack/deque) instead of recursion.
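The iterative approach proposed above could look roughly like the following. This is a minimal sketch, not the project's implementation: `FakeNode` is a hypothetical stand-in that exposes only the `type` and `children` attributes the traversal needs, and `iter_nodes` is an illustrative name.

```python
from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class FakeNode:
    """Hypothetical stand-in for a syntax tree node (only `type` and `children`)."""
    type: str
    children: list["FakeNode"] = field(default_factory=list)


def iter_nodes(root: FakeNode) -> Iterator[FakeNode]:
    """Yield nodes in pre-order using an explicit stack instead of recursion."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))


# Build a chain far deeper than Python's default recursion limit.
deep = FakeNode("leaf")
for _ in range(10_000):
    deep = FakeNode("list", [deep])

print(sum(1 for _ in iter_nodes(deep)))  # prints 10001
```

Because the pending nodes live in a heap-allocated list, the depth of the tree is bounded by available memory rather than by the interpreter's call stack.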
+
+## 3. Single Source of Truth for Errors
+**Problem**: The project is in a transitional state where error management is split between the new `SkillParser` diagnostics and the legacy `server.errs` dictionary in `main.py`.
+**Goal**: Unify error reporting into a single, streamlined pipeline.
+**Proposed Actions**:
+- [ ] Complete the refactor of `skillls/main.py`.
+- [ ] Remove the `errs` dictionary from `SkillLanguageServer`.
+- [ ] Decommission and delete deprecated files: `skillls/checker.py` and unused parts of `skillls/helpers.py`.
+
+## 4. Dependency Management Stabilization
+**Problem**: The dependency on a private SSH Git URL for `tree-sitter-skill` introduces external failure points into the build pipeline.
+**Goal**: Stabilize the build environment.
+**Proposed Actions**:
+- [ ] Evaluate the feasibility of publishing `tree-sitter-skill` to a private PyPI registry or a more accessible artifact repository.
+- [ ] Implement a fallback/vendoring strategy for critical grammar components if possible.
@@ -0,0 +1,19 @@
+"""
+Centralized constants for the Skill language parser and LSP server.
+"""
+
+from typing import Final, Set
+
+# Node types that represent syntax errors in Tree-sitter
+ERROR_NODE_TYPES: Final[Set[str]] = {"ERROR", "MISSING"}
+
+# Node types that are considered significant enough to appear in the Document Symbol outline
+SYMBOLIC_NODE_TYPES: Final[Set[str]] = {
+    "function_definition",
+    "procedure_definition",
+    "namespace",
+    "let_binding",
+}
+
+# Node types used to identify names/identifiers within symbolic nodes
+IDENTIFIER_NODE_TYPES: Final[Set[str]] = {"identifier", "name"}
+4
-4
@@ -9,6 +9,7 @@ from lsprotocol.types import (
     SymbolKind,
 )
 from pygls.workspace import TextDocument
+from skillls.constants import ERROR_NODE_TYPES, IDENTIFIER_NODE_TYPES, SYMBOLIC_NODE_TYPES

 class SkillParser:
     """
@@ -51,7 +52,7 @@ class SkillParser:
         """Recursively traverses the AST to find errors and symbols."""

         # 1. Handle Errors (Diagnostics)
-        if node.type == "ERROR" or node.type == "MISSING":
+        if node.type in ERROR_NODE_TYPES:
             start_point = node.start_point
             end_point = node.end_point

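A note on the error check in this hunk: as far as I understand Tree-sitter's behaviour, a missing token is usually reported as a node of the *expected* type with its `is_missing` flag set, so comparing `node.type` against the string `"MISSING"` may never match. A hedged sketch of a check that covers both cases follows; `FakeNode` and `is_syntax_problem` are hypothetical names used only for illustration, with `FakeNode` standing in for a real syntax tree node.

```python
from dataclasses import dataclass


@dataclass
class FakeNode:
    """Hypothetical stand-in exposing the two attributes the check reads."""
    type: str
    is_missing: bool = False


def is_syntax_problem(node) -> bool:
    """True for ERROR nodes and for nodes the parser inserted as missing."""
    return node.type == "ERROR" or bool(getattr(node, "is_missing", False))


# A missing ")" keeps the type ")" and is only distinguishable via the flag.
print(is_syntax_problem(FakeNode(")", is_missing=True)))  # prints True
print(is_syntax_problem(FakeNode("ERROR")))               # prints True
print(is_syntax_problem(FakeNode("identifier")))          # prints False
```

If this holds for the `tree-sitter-skill` grammar, unbalanced parentheses would be reported through the flag rather than through the node type.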
@@ -78,14 +79,13 @@ class SkillParser:

     def _is_symbol_node(self, node) -> bool:
         """Determines if a node is significant enough to be an outline symbol."""
-        symbolic_types = {"function_definition", "procedure_definition", "namespace", "let_binding"}
-        return node.type in symbolic_types or node.type.endswith("_def")
+        return node.type in SYMBOLIC_NODE_TYPES or node.type.endswith("_def")

     def _create_document_symbol(self, node, content: str) -> DocumentSymbol | None:
         """Extracts a name and range for an AST node to create an LSP symbol."""
         name = None
         for child in node.children:
-            if child.type == "identifier" or child.type == "name":
+            if child.type in IDENTIFIER_NODE_TYPES:
                 start_byte = child.start_byte
                 end_byte = child.end_byte
                 name = content[start_byte:end_byte]
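One thing worth checking in the name extraction above: `start_byte` and `end_byte` are byte offsets, while `content` is a `str` indexed by character. Assuming the tree was parsed from the UTF-8 encoding of `content`, the two only agree for pure-ASCII text. The sketch below shows the difference; `slice_by_byte_offsets` is a hypothetical helper, not part of the change.

```python
def slice_by_byte_offsets(content: str, start_byte: int, end_byte: int) -> str:
    """Slice text using UTF-8 byte offsets instead of character indices."""
    return content.encode("utf-8")[start_byte:end_byte].decode("utf-8")


source = "é foo"             # "é" occupies two bytes in UTF-8
start_byte, end_byte = 3, 6  # byte span of "foo" in the encoded text

print(source[start_byte:end_byte])                          # prints oo
print(slice_by_byte_offsets(source, start_byte, end_byte))  # prints foo
```

Encoding the whole document per symbol is wasteful, so a real fix would more likely encode once and reuse the bytes; the helper only illustrates the offset mismatch.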