Notebook and Module Parsing

Parse notebook metadata and content, and extract function and class signatures with docments-style comments

Data Models for Parsing


FunctionInfo

 FunctionInfo (name:str, signature:str, docstring:Optional[str]=None,
               decorators:List[str]=<factory>, is_exported:bool=False,
               is_async:bool=False, source_line:Optional[int]=None)

Information about a function


VariableInfo

 VariableInfo (name:str, value:Optional[str]=None,
               type_hint:Optional[str]=None, comment:Optional[str]=None,
               is_exported:bool=False)

Information about a module-level variable


ClassInfo

 ClassInfo (name:str, signature:str, docstring:Optional[str]=None,
            methods:List[__main__.FunctionInfo]=<factory>,
            decorators:List[str]=<factory>,
            attributes:List[__main__.VariableInfo]=<factory>,
            is_exported:bool=False, source_line:Optional[int]=None)

Information about a class


ModuleInfo

 ModuleInfo (path:pathlib.Path, name:str, title:Optional[str]=None,
             description:Optional[str]=None,
             functions:List[__main__.FunctionInfo]=<factory>,
             classes:List[__main__.ClassInfo]=<factory>,
             variables:List[__main__.VariableInfo]=<factory>,
             imports:List[str]=<factory>)

Information about a module (notebook or Python file)
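The `<factory>` defaults in the signatures above suggest dataclass fields declared with `field(default_factory=list)`. The following is a minimal sketch of two of the models, reconstructed only from the rendered signatures, so the actual source may differ. The `classes` and `variables` fields are left out for brevity, and the module names and the `demo` function are hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import List, Optional


@dataclass
class FunctionInfo:
    "Information about a function"
    name: str
    signature: str
    docstring: Optional[str] = None
    decorators: List[str] = field(default_factory=list)
    is_exported: bool = False
    is_async: bool = False
    source_line: Optional[int] = None


@dataclass
class ModuleInfo:
    "Information about a module (notebook or Python file)"
    path: Path
    name: str
    title: Optional[str] = None
    description: Optional[str] = None
    functions: List[FunctionInfo] = field(default_factory=list)
    imports: List[str] = field(default_factory=list)
    # `classes` and `variables` are omitted from this sketch


# Hypothetical module names, used only to show the defaults in action
mod_a = ModuleInfo(path=Path("a.ipynb"), name="a")
mod_b = ModuleInfo(path=Path("b.ipynb"), name="b")
mod_a.functions.append(FunctionInfo(name="demo", signature="def demo(): ..."))

# default_factory gives every instance its own list
print(len(mod_a.functions), len(mod_b.functions))
```

Using `default_factory` rather than a literal `[]` default matters here: dataclasses reject mutable defaults, and a shared list would leak parsed functions from one module into another.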

AST Parsing Utilities


extract_docments_signature

 extract_docments_signature (node:Union[ast.FunctionDef,ast.AsyncFunctionDef],
                             source_lines:List[str])

Extract function signature with docments-style comments

              Type   Details
node          Union  AST function node
source_lines  List   Source code lines
Returns       str    Function signature
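Docments-style comments live in the source text, not in the AST, which is presumably why the function takes `source_lines` alongside the node. One way to recover them is to slice the raw lines from the `def` line up to the first body statement. The helper below, `sketch_docments_signature`, is a hypothetical illustration of that idea and not the library's implementation.

```python
import ast
from typing import List, Union


def sketch_docments_signature(
    node: Union[ast.FunctionDef, ast.AsyncFunctionDef],
    source_lines: List[str],
) -> str:
    "Return the raw signature lines, keeping inline comments"
    start = node.lineno - 1                   # `def` line (decorators sit above it)
    first_body = node.body[0].lineno - 1      # docstring or first statement
    end = max(first_body, start + 1)          # always keep at least the `def` line
    sig_lines = source_lines[start:end]
    # Drop blank lines between the signature and the body
    while sig_lines and not sig_lines[-1].strip():
        sig_lines.pop()
    return "\n".join(sig_lines)


source = '''def add(a: int,        # First operand
        b: int = 0     # Second operand
        ) -> int:      # Sum of both
    "Add two numbers"
    return a + b
'''
tree = ast.parse(source)
sig = sketch_docments_signature(tree.body[0], source.splitlines())
print(sig)
```

This sketch has known limits: a one-line function such as `def f(): pass` is returned whole, and a standalone comment line between the signature and the body would be kept as part of the signature.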

parse_function

 parse_function (node:Union[ast.FunctionDef,ast.AsyncFunctionDef],
                 source_lines:List[str], is_exported:bool=False)

Parse a function definition from AST

              Type          Default  Details
node          Union                  AST function node
source_lines  List                   Source code lines
is_exported   bool          False    Has #| export
Returns       FunctionInfo           Function information
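Most `FunctionInfo` fields can be read directly off the AST node using the standard `ast` module. The sketch below uses a hypothetical helper, `sketch_parse_function`, that returns a plain dict instead of a `FunctionInfo` and leaves out the signature extraction. It assumes Python 3.9+ for `ast.unparse`, and the `retry` decorator in the sample source is made up.

```python
import ast
from typing import Any, Dict, Union


def sketch_parse_function(
    node: Union[ast.FunctionDef, ast.AsyncFunctionDef],
    is_exported: bool = False,
) -> Dict[str, Any]:
    "Collect FunctionInfo-like fields from a function node"
    return {
        "name": node.name,
        "docstring": ast.get_docstring(node),
        # Decorators are expressions, so unparse them back to source text
        "decorators": [ast.unparse(d) for d in node.decorator_list],
        "is_async": isinstance(node, ast.AsyncFunctionDef),
        "is_exported": is_exported,
        "source_line": node.lineno,
    }


source = '''@retry(times=3)
async def fetch(url: str):
    "Fetch a URL"
    return url
'''
info = sketch_parse_function(ast.parse(source).body[0], is_exported=True)
print(info)
```

Since Python 3.8, `node.lineno` points at the `def` line rather than at the first decorator, so `source_line` is 2 for this sample.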

parse_class

 parse_class (node:ast.ClassDef, source_lines:List[str],
              is_exported:bool=False)

Parse a class definition from AST

              Type       Default  Details
node          ClassDef            AST class node
source_lines  List                Source code lines
is_exported   bool       False    Has #| export
Returns       ClassInfo           Class information
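A `ClassInfo` carries decorators, methods, and attributes. A rough way to gather these is to scan the class body once: function nodes become methods and annotated assignments become attributes. The helper `sketch_parse_class` and the `Point` class below are hypothetical, and the real function may also handle plain assignments or nested classes.

```python
import ast
from typing import Any, Dict


def sketch_parse_class(node: ast.ClassDef) -> Dict[str, Any]:
    "Collect ClassInfo-like fields from a class node"
    methods = [
        n.name for n in node.body
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    # Annotated assignments such as `x: float` are treated as attributes
    attributes = [
        n.target.id for n in node.body
        if isinstance(n, ast.AnnAssign) and isinstance(n.target, ast.Name)
    ]
    return {
        "name": node.name,
        "docstring": ast.get_docstring(node),
        "decorators": [ast.unparse(d) for d in node.decorator_list],
        "methods": methods,
        "attributes": attributes,
    }


source = '''@dataclass
class Point:
    "A 2D point"
    x: float
    y: float = 0.0

    def norm(self) -> float:
        return (self.x ** 2 + self.y ** 2) ** 0.5
'''
cls_info = sketch_parse_class(ast.parse(source).body[0])
print(cls_info)
```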

parse_variable

 parse_variable (node:Union[ast.Assign,ast.AnnAssign],
                 source_lines:List[str], is_exported:bool=False)

Parse variable assignments from AST

              Type   Default  Details
node          Union           AST assignment node
source_lines  List            Source code lines
is_exported   bool   False    Has #| export
Returns       List            Variable information
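The return type is a list because a single assignment statement can bind several names, as in `a = b = 0`. The sketch below shows one way this could work, using a hypothetical helper `sketch_parse_variable` that returns dicts instead of `VariableInfo` objects. The inline-comment lookup is deliberately naive and would misfire on a `#` inside a string literal.

```python
import ast
from typing import Dict, List, Optional, Union


def sketch_parse_variable(
    node: Union[ast.Assign, ast.AnnAssign],
    source_lines: List[str],
) -> List[Dict[str, Optional[str]]]:
    "Return one VariableInfo-like dict per assigned name"
    line = source_lines[node.lineno - 1]
    # Naive inline-comment lookup (assumption: no '#' inside string values)
    comment = line.split("#", 1)[1].strip() if "#" in line else None

    if isinstance(node, ast.AnnAssign):
        targets = [node.target]
        type_hint = ast.unparse(node.annotation)
    else:
        targets = node.targets
        type_hint = None

    value = ast.unparse(node.value) if node.value is not None else None
    return [
        {"name": t.id, "value": value, "type_hint": type_hint, "comment": comment}
        for t in targets
        if isinstance(t, ast.Name)  # skip tuple unpacking, attributes, subscripts
    ]


source = '''a = b = 0  # Shared counter
MAX_DEPTH: int = 3
'''
lines = source.splitlines()
tree = ast.parse(source)
chained = sketch_parse_variable(tree.body[0], lines)
annotated = sketch_parse_variable(tree.body[1], lines)
print(chained)
print(annotated)
```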

Notebook Cell Parsing


parse_code_cell

 parse_code_cell (cell:Dict[str,Any])

Parse a notebook code cell for functions, classes, variables, and imports

         Type   Details
cell     Dict   Notebook code cell
Returns  Tuple  (functions, classes, variables, imports)
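In the notebook JSON format, a cell's `source` may be either a single string or a list of lines, and cells containing IPython magics are not valid Python. The sketch below handles both cases and sorts top-level nodes into the four returned lists. It is a simplified illustration with hypothetical helper names that collects only names rather than full info objects, and its `#| export` check is a plain prefix match that also accepts variants such as `#| exporti`.

```python
import ast
from typing import Any, Dict, List, Tuple


def cell_source(cell: Dict[str, Any]) -> str:
    "Normalize a cell's source to one string"
    source = cell.get("source", "")
    return "".join(source) if isinstance(source, list) else source


def has_export_directive(cell: Dict[str, Any]) -> bool:
    "Naive check for a `#| export` line"
    return any(l.strip().startswith("#| export") for l in cell_source(cell).splitlines())


def sketch_parse_code_cell(
    cell: Dict[str, Any],
) -> Tuple[List[str], List[str], List[str], List[str]]:
    "Return (functions, classes, variables, imports) as lists of names"
    functions, classes, variables, imports = [], [], [], []
    try:
        tree = ast.parse(cell_source(cell))
    except SyntaxError:  # e.g. cells with IPython magics
        return functions, classes, variables, imports

    for node in tree.body:  # top-level statements only
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            functions.append(node.name)
        elif isinstance(node, ast.ClassDef):
            classes.append(node.name)
        elif isinstance(node, ast.Assign):
            variables += [t.id for t in node.targets if isinstance(t, ast.Name)]
        elif isinstance(node, ast.AnnAssign) and isinstance(node.target, ast.Name):
            variables.append(node.target.id)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            imports.append(ast.unparse(node))
    return functions, classes, variables, imports


cell = {
    "cell_type": "code",
    "source": [
        "#| export\n",
        "from pathlib import Path\n",
        "DEFAULT = Path('.')\n",
        "def find(path): return path\n",
        "class Finder: pass\n",
    ],
}
parsed = sketch_parse_code_cell(cell)
magic = sketch_parse_code_cell({"cell_type": "code", "source": "%matplotlib inline"})
print(has_export_directive(cell), parsed, magic)
```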

Module Parsing


parse_notebook

 parse_notebook (path:pathlib.Path)

Parse a notebook file for module information

         Type        Details
path     Path        Path to notebook
Returns  ModuleInfo  Module information
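By nbdev convention, a notebook's first markdown cell holds a `# Title` heading followed by a `> description` blockquote, and this is presumably where the `title` and `description` fields come from. A minimal sketch of that extraction, using hypothetical helper names and a made-up in-memory notebook, could look like this:

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional, Tuple


def sketch_load_notebook(path: Path) -> Dict[str, Any]:
    "Read a .ipynb file, which is plain JSON"
    return json.loads(Path(path).read_text(encoding="utf-8"))


def sketch_title_and_description(
    nb: Dict[str, Any],
) -> Tuple[Optional[str], Optional[str]]:
    "Read `# Title` and `> description` from the first markdown cell"
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        title = description = None
        for line in source.splitlines():
            line = line.strip()
            if title is None and line.startswith("# "):
                title = line[2:].strip()
            elif description is None and line.startswith(">"):
                description = line[1:].strip()
        return title, description  # only the first markdown cell is inspected
    return None, None


# Made-up notebook content for illustration
nb = {
    "cells": [
        {"cell_type": "code", "source": "#| default_exp demo"},
        {"cell_type": "markdown", "source": ["# Demo Module\n", "\n", "> A demo notebook\n"]},
    ]
}
meta = sketch_title_and_description(nb)
print(meta)
```

A full `parse_notebook` would then combine this metadata with the results of parsing each code cell.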

parse_python_file

 parse_python_file (path:pathlib.Path)

Parse a Python file for module information

         Type        Details
path     Path        Path to Python file
Returns  ModuleInfo  Module information
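A plain Python file has no `#| export` directives, so a parser needs another signal for what counts as exported. One plausible choice, used in this sketch as an assumption, is the module's `__all__` list. Taking the description from the module docstring is likewise an assumption, and `sketch_parse_python_file` is a hypothetical name.

```python
import ast
import tempfile
from pathlib import Path
from typing import Any, Dict


def sketch_parse_python_file(path: Path) -> Dict[str, Any]:
    "Collect ModuleInfo-like fields from a .py file"
    tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))

    # Assumption: names listed in __all__ count as exported
    exported = set()
    for node in tree.body:
        if isinstance(node, ast.Assign) and any(
            isinstance(t, ast.Name) and t.id == "__all__" for t in node.targets
        ):
            try:
                exported = set(ast.literal_eval(node.value))
            except ValueError:  # __all__ built dynamically; give up
                pass

    return {
        "name": path.stem,
        "description": ast.get_docstring(tree),
        # (name, is_exported) pairs for top-level functions
        "functions": [
            (n.name, n.name in exported)
            for n in tree.body
            if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
        ],
        "imports": [
            ast.unparse(n) for n in tree.body
            if isinstance(n, (ast.Import, ast.ImportFrom))
        ],
    }


with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.py"
    demo.write_text(
        '"""Demo module"""\n'
        '__all__ = ["public"]\n'
        "import os\n"
        "def public(): pass\n"
        "def _private(): pass\n",
        encoding="utf-8",
    )
    file_info = sketch_parse_python_file(demo)

print(file_info)
```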

Testing

Let’s test the parser on our own notebooks:

# Test parsing the core module
from pathlib import Path
core_info = parse_notebook(Path("core.ipynb"))
print(f"Module: {core_info.name}")
print(f"Title: {core_info.title}")
print(f"Description: {core_info.description}")
print(f"\nFunctions ({len(core_info.functions)}):")
for func in core_info.functions[:3]:  # Show first 3
    print(f"  - {func.name}")
print(f"\nClasses ({len(core_info.classes)}):")
for cls in core_info.classes[:3]:  # Show first 3
    print(f"  - {cls.name}")
    
print("\nTesting refactored parse_class function...")
print("Class signatures:")
for cls in core_info.classes:
    print(f"\n{cls.name}:")
    print(f"  Decorators: {cls.decorators}")
    print(f"  Methods: {[m.name for m in cls.methods]}")
    print(f"  Attributes: {[a.name for a in cls.attributes]}")
    print(f"  Signature: {cls.signature[:100]}...")  # First 100 chars
Module: core
Title: Core Utilities
Description: Core utilities and data models for nbdev project overview generation

Functions (4):
  - get_notebook_files
  - get_subdirectories
  - read_notebook

Classes (2):
  - NotebookInfo
  - DirectoryInfo

Testing refactored parse_class function...
Class signatures:

NotebookInfo:
  Decorators: ['dataclass']
  Methods: ['relative_path']
  Attributes: ['path', 'name', 'title', 'description', 'export_module']
  Signature: class NotebookInfo:...

DirectoryInfo:
  Decorators: ['dataclass']
  Methods: ['total_notebook_count']
  Attributes: ['path', 'name', 'notebook_count', 'description', 'subdirs', 'notebooks']
  Signature: class DirectoryInfo:...

# Test extracting function signatures
print("Function signatures with docments:")
for func in core_info.functions[:2]:
    print(f"\n{func.name}:")
    print(func.signature)

Function signatures with docments:

get_notebook_files:
def get_notebook_files(path: Path = None,           # Directory to search (defaults to nbs_path)
                      recursive: bool = True        # Search subdirectories
                      ) -> List[Path]:              # List of notebook paths

get_subdirectories:
def get_subdirectories(path: Path = None,           # Directory to search (defaults to nbs_path)
                      recursive: bool = False       # Include all nested subdirectories
                      ) -> List[Path]:              # List of directory paths