Home
Note to Reader
The following document as well as the Documentation and Code Comments pages are intended for developers. A user who doesn't intend to contribute to development might find the User Guide page more helpful.
If you discover an issue with this repository or have a question, please feel free to open an issue. I've included templates for the following issues:
- Spelling and Grammar: Found some language that is incorrect?
- Clarity: Found a section that just makes no sense?
- Question: Do you have a general question?
- Bug: Found an error in the code?
- Enhancement: Have a suggestion for improving the toolchain?
Cite Me
BibTeX and APA on the right sidebar of GitHub.
License
GNU GPL v3
Planning and Administration
Tasks
Tasks are tracked as GitHub issues. Each Enhancement and Bug generates the following collection
of issues and child issues:
- A primary issue describing the goal:
  - A documentation child issue.
  - An implementation child issue.
  - A validation child issue.
Version control
The toolchain shall be kept under Git versioning. Development shall take place on branches with
main on GitHub as a source of truth. GitHub pull requests shall serve as the arbiter for inclusion
on main with the following quality gates:
- Compiling of source code.
- Running and passing the unit test suite.
- Running and passing linting and style enforcers.
- Successful generation of documentation.
Release Tagging
The project shall be tagged when an Enhancement or Bug issue is merged into main. The tag shall
follow semantic versioning for labels.
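As an illustration of the tagging rule above, the following sketch bumps a semantic version and formats a tag label. The mapping of issue type to version field (Enhancement to MINOR, Bug to PATCH), the `v` prefix, and every name used here (`version_t`, `issue_kind_t`, `bump_version`, `format_tag`) are assumptions made for the example; the project does not state them.

```c
#include <stdio.h>

/* Hypothetical types for illustration; not part of the PDGL sources. */
typedef enum { ISSUE_BUG, ISSUE_ENHANCEMENT } issue_kind_t;

typedef struct {
    unsigned major;
    unsigned minor;
    unsigned patch;
} version_t;

/* Assumed policy: a Bug merge bumps PATCH; an Enhancement merge bumps
 * MINOR and resets PATCH, as in the usual semantic versioning rules. */
version_t bump_version(version_t v, issue_kind_t kind)
{
    if (kind == ISSUE_ENHANCEMENT) {
        v.minor += 1;
        v.patch = 0;
    } else {
        v.patch += 1;
    }
    return v;
}

/* Writes a label such as "v1.3.0" into buf. Returns 0 on success and -1
 * when the buffer is too small, so the failure is detectable. */
int format_tag(version_t v, char *buf, size_t len)
{
    int n = snprintf(buf, len, "v%u.%u.%u", v.major, v.minor, v.patch);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A breaking change would bump MAJOR under semantic versioning; that case is left out because the source only names Enhancement and Bug issues.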
Project Structure
.
├── docs
│   └── README.md
├── libraries
│   ├── <libraries>
│   ├── CMakeLists.txt
│   └── Findlizard
├── languages
│   └── <language definitions>
├── source
│   └── <library>
│       ├── test
│       │   ├── test_<>.c
│       │   └── CMakeLists.txt
│       ├── src
│       │   ├── <>.c
│       │   └── <>.h
│       ├── docs
│       │   ├── media
│       │   ├── index.md
│       │   └── unit-description.md
│       ├── CMakeLists.txt
│       └── mkdocs.yml
├── wrappers
│   └── <wrapper>
│       ├── test
│       │   ├── test_<>.c
│       │   └── CMakeLists.txt
│       ├── src
│       │   ├── <>.c
│       │   └── <>.h
│       ├── docs
│       │   ├── media
│       │   ├── index.md
│       │   └── unit-description.md
│       ├── CMakeLists.txt
│       └── mkdocs.yml
├── CITATION
├── CMakeLists.txt
├── flake.lock
├── flake.nix
├── Justfile
├── LICENSE
├── mkdocs.yml
└── requirements.txt
Directories of Interest
- Source: This directory contains the C libraries for the PDGL.
- Wrappers: This directory contains the C executable wrappers for the PDGL.
- Docs: This directory contains the high-level documentation for the PDGL.
- Languages: This directory contains language definitions for the PDGL.
Define a Unit
A unit in this project shall be defined as a header file for a C library module.
Quality
The PDGL and its units shall be fail-safe: the PDGL and its units may fail, but every failure must be detectable.
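One common way to make failures detectable in C is to return a status code from every function that can fail. The sketch below shows that pattern; the `status_t` codes and the `copy_word` function are hypothetical and only illustrate the requirement, since the actual PDGL error model is not described here.

```c
#include <stddef.h>

/* Hypothetical status codes; the real PDGL error model may differ. */
typedef enum {
    STATUS_OK = 0,
    STATUS_NULL_ARGUMENT,
    STATUS_BUFFER_TOO_SMALL
} status_t;

/* Copies src into dst. Every failure path returns a distinct non-OK
 * status, and dst is reset to an empty string when the copy overflows. */
status_t copy_word(char *dst, size_t dst_len, const char *src)
{
    size_t i = 0;

    if (dst == NULL || src == NULL) return STATUS_NULL_ARGUMENT;
    if (dst_len == 0) return STATUS_BUFFER_TOO_SMALL;

    while (src[i] != '\0') {
        if (i + 1 >= dst_len) {      /* no room left for the terminator */
            dst[0] = '\0';
            return STATUS_BUFFER_TOO_SMALL;
        }
        dst[i] = src[i];
        i++;
    }
    dst[i] = '\0';
    return STATUS_OK;
}
```

A caller can then branch on the returned status instead of inspecting the output buffer, which keeps the failure visible at the call site.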
Unit Testing
Each C module shall be unit tested. Lower level components may or may not be mocked for higher level components.
Integration Testing
No integration test is expected for the library code. Integration tests are expected to be carried out by wrappers.
Requirements
The PDGL reimplements portions of the original DGL by Maurer [1] (source is
available on Dr. Maurer's personal website and mirrored on GitHub). The original DGL consumes a
language definition for a grammar (usually context-free) and produces a compilable .c source file.
This workflow is cumbersome in practice. The PDGL instead intends to implement a set of portable
libraries that consume a language definition and directly produce words of that language. To that
end, the PDGL shall match the features and use cases of the original DGL where possible. Some
features may be hard or impossible to reproduce with a modular design. The PDGL shall forgo the
DGL language itself in favor of language definitions written in TOML.
DGL Vs. PDGL Feature Matrix
| Symbol | Support Level |
|---|---|
| | Full Support |
| | Partial Support |
| | Support Planned |
| | Unsupported |
| DGL Production Type | Support | PDGL Production Type or Implementation Difficulty |
|---|---|---|
| "Unweighted production" | | Pure Production |
| "Weighted production" | | Weighted Production |
| "Character Range production [a-z]" | Can be reproduced with a list in a pure production. | Pure Production |
| "Arithmetic Productions" | | Hard for native arithmetic productions, as modeling the storage of a global variable is a pain point that requires thought. |
| action | PDGL doesn't maintain state. | Janet Production. Adding the ability to maintain state of a production is easy. Having state be scoped to a language is hard. |
| range | | Range Production |
| counter | | Easy, some care to be taken in the termination case. |
| unique | | Easy, some care to be taken in the overflow and termination case. |
| chain | | Hard, I can't see how to do this without a language scope state context. |
| double | | Easy, straightforward extension of a range production. |
| permutation | | Hard, I can't see how to do this without a language scope state context. |
| sequence | | Easy, straightforward extension of a pure production. |
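To make the difference between a pure and a weighted production concrete, here is a minimal sketch of weighted selection over cumulative weights. The `weighted_choice_t` type and the `pick_weighted` function are hypothetical; how the PDGL actually stores weights and draws random numbers is not described in this document.

```c
#include <stddef.h>

/* Hypothetical alternative of a weighted production. */
typedef struct {
    const char *replacement;
    unsigned weight;
} weighted_choice_t;

/* Returns the index of the alternative selected by roll, where roll is
 * expected to be drawn uniformly from [0, sum of weights). Returns -1
 * when the list is NULL or empty, or when roll is out of range. */
int pick_weighted(const weighted_choice_t *choices, size_t count, unsigned roll)
{
    unsigned cumulative = 0;

    if (choices == NULL) return -1;
    for (size_t i = 0; i < count; i++) {
        cumulative += choices[i].weight;
        if (roll < cumulative) return (int)i;
    }
    return -1;
}
```

Passing the roll in as a parameter, rather than calling a random number generator inside the function, keeps the selection deterministic and therefore straightforward to unit test. A pure production corresponds to the special case where every weight is equal.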
Functional Requirements
Use Cases
Functional requirements for the toolchain are phrased as use cases. The following use case diagram models the interdependence of those use cases.
flowchart LR
aU["User"]
aS["PDGL"]
subgraph wrap [Wrappers]
subgraph cli [CLI]
egolfcl(["Execute Generation of Language From Command Line"])
end
subgraph wasm [WASM]
egolfb(["Execute Generation of Language From Browser"])
end
end
subgraph lib [Libraries]
SLS(["Supply Language Specification"])
LL(["Load Language Specification"])
LSWD(["Language Specification is Well-defined"])
LSI(["Log State Information"])
EGL(["Execute Generation of Language"])
EMG(["Execute Multiple Generations"])
RPGS(["Report Portion of Generation String"])
EP(["Execute Production"])
PPtS(["Push Production to Stack"])
PPfS(["Pop Production from Stack"])
EPP(["Execute Pure Production"])
EWP(["Execute Weighted Production"])
EJP(["Execute Janet Production"])
ERP(["Execute Range Production"])
FTP(["Force Terminate Production"])
EX(["Force Exit"])
SG(["Stop Generation"])
scr[("Stdout")]
sti[("Stdin")]
err[("Stderr")]
sta[("Resolution Stack")]
SG -. include .-> FTP
SG -. include .-> EX
SLS -. include .-> LSWD
SLS -. include .-> LL
EMG -. include .-> EGL
EGL -. include .-> EP
EP -. include .-> FTP
FTP -. include .-> RPGS
EP -. include .-> PPtS
EP -. include .-> PPfS
EP -. include .-> EPP
EP -. include .-> EWP
EP -. include .-> ERP
EP -. include .-> EJP
EP -. include .-> RPGS
SLS -. uses .-> sti
LSI -. uses .-> err
RPGS -. uses .-> scr
PPtS -. uses .-> sta
PPfS -. uses .-> sta
end
aS --> LSI
aS --> SG
aU --> SLS
aU --> EGL
aU --> EMG
aU --> egolfcl
aU --> egolfb
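The two stack use cases in the diagram above (Push Production to Stack and Pop Production from Stack) can be sketched as a fixed-capacity stack with detectable overflow and underflow. The type name, function names, capacity, and the choice to store production names as strings are all assumptions made for this illustration, not the PDGL implementation.

```c
#include <stddef.h>

#define STACK_CAPACITY 8  /* arbitrary size chosen for the example */

/* Hypothetical resolution stack holding names of pending productions. */
typedef struct {
    const char *items[STACK_CAPACITY];
    size_t depth;
} resolution_stack_t;

/* Push Production to Stack: returns 0 on success, -1 on bad input or
 * when the stack is full. */
int stack_push(resolution_stack_t *s, const char *production)
{
    if (s == NULL || production == NULL || s->depth == STACK_CAPACITY) return -1;
    s->items[s->depth++] = production;
    return 0;
}

/* Pop Production from Stack: returns the most recently pushed entry,
 * or NULL when the stack is empty. */
const char *stack_pop(resolution_stack_t *s)
{
    if (s == NULL || s->depth == 0) return NULL;
    return s->items[--s->depth];
}

/* Pushes the symbols of a replacement right to left, so the leftmost
 * symbol is popped, and therefore resolved, first. */
int stack_push_sequence(resolution_stack_t *s, const char *const *symbols, size_t count)
{
    for (size_t i = count; i > 0; i--) {
        if (stack_push(s, symbols[i - 1]) != 0) return -1;
    }
    return 0;
}
```

Pushing in reverse order is one way to get left-to-right expansion out of a last-in, first-out structure; whether the PDGL resolves productions in that order is not stated in the source.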
Phase 1:
- Execute Multiple Generations
- Execute Generation of Language
- Execute Production
- Report Portion of Generation String
- Push Production to Stack
- Pop Production from Stack
- Execute Pure Production
- Execute Janet Production
- Execute Range Production
- Execute Weighted Production
- Supply Language Specification
- Load Language Specification
- Language Specification is Well-defined
- Stop Generation
- Force Exit
- Force Terminate Production
Phase 2:
Architectural Decisions
For the PDGL libraries, the use cases should be sufficient to motivate and document behavior. Where they are insufficient to document a specific architectural decision, a collection of MADRs [2] (https://github.com/adr/madr) should be used to document that decision.
Wrappers may reference system use cases and define their own use cases. However, MADRs [2] (https://github.com/adr/madr) should serve as the primary documentation for the architecture of a wrapper.
Nonfunctional Requirements
Colors
Diagrams included in documentation for features (use case and unit descriptions) are expected to use the COLORS color palette.
Technologies
Languages and Frameworks
The PDGL and its components shall be written in C, using clang as the compiler and CMake as the build system. By design, the entry point shall be decoupled from core functionality. The components are expected to be compilable with various tooling, including C/C++, Python, and JavaScript. This requires all 'external' interfaces to be C++ linkable.
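A C header is usually made C++ linkable by wrapping its declarations in an `extern "C"` block that is only visible to C++ compilers. The sketch below shows that guard; the header guard name and the `pdgl_example_word_length` function are hypothetical and stand in for any public PDGL interface.

```c
/* Hypothetical public header layout; the function name is illustrative. */
#ifndef PDGL_EXAMPLE_H
#define PDGL_EXAMPLE_H

#include <stddef.h>

#ifdef __cplusplus
extern "C" {            /* disable C++ name mangling for these symbols */
#endif

/* Returns the length of word, or 0 when word is NULL. */
size_t pdgl_example_word_length(const char *word);

#ifdef __cplusplus
}
#endif

#endif /* PDGL_EXAMPLE_H */

/* The definition would normally live in the matching .c file. */
size_t pdgl_example_word_length(const char *word)
{
    size_t n = 0;

    if (word == NULL) return 0;
    while (word[n] != '\0') n++;
    return n;
}
```

Because the symbol keeps its unmangled C name, the same compiled library can be called from C++ and from foreign function interfaces that look up plain C symbols.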
Unit tests for runnable and data wrangler libraries will use the Unity and CMock libraries. Test indexing is handled by CTest.
Tools:
- git
- mermaid.js
- Unity
- clang
- CMake
- CTest
- Doxygen
- CMock
- Python
- mkdocs
- Pytest
- prek
- valgrind
- tombi
- uncrustify
- rumdl
- MADR [2]
Documentation of Implementation
C/C++ code is documented with Doxygen. The Doxygen comments shall be parsed and output as XML. General documentation shall be recorded as Markdown files in each module's monorepo. Documentation shall be aggregated using the mkdoxy framework.
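For reference, a Doxygen comment block on a C function typically looks like the sketch below. The `count_char` function is purely illustrative and not part of the PDGL API; the commands used (`@brief`, `@param`, `@return`, `@p`) are standard Doxygen commands, and XML output is controlled by the `GENERATE_XML` option in the Doxyfile.

```c
#include <stddef.h>

/**
 * @brief Counts the occurrences of a character in a string.
 *
 * Illustrative function only; it is not part of the PDGL API.
 *
 * @param[in] text   NUL-terminated string to scan; may be NULL.
 * @param[in] target Character to count.
 * @return Number of occurrences, or 0 when @p text is NULL.
 */
size_t count_char(const char *text, char target)
{
    size_t count = 0;

    if (text == NULL) return 0;
    for (; *text != '\0'; text++) {
        if (*text == target) count++;
    }
    return count;
}
```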
Code Style Guide
The C/C++ code in this repository shall be formatted by the bundled uncrustify configuration.
[1] Peter M. Maurer. DGL Version 2 – Random Testing in the Mobile Computing Era. In Henry Han and Erich Baker, editors, Next Generation Data Science, volume 2113, pages 172–183. Springer Nature Switzerland, 2024. URL: https://link.springer.com/10.1007/978-3-031-61816-1_12 (visited on 2026-01-16), doi:10.1007/978-3-031-61816-1_12.
[2] Oliver Kopp, Anita Armbruster, and Olaf Zimmermann. Markdown architectural decision records: format and tool support. In ZEUS. 2018.