"Converting a standard grammar with recursion and alternations into a regular expression: A step-by-step

Question

A grammar is considered regular if it follows either a right-linear or left-linear pattern. According to this tutorial, this type of grammar possesses a unique property:

Regular grammars have a special characteristic: through the substitution of every nonterminal (excluding the root) with its corresponding righthand side, the grammar can be simplified to a single production for the root, containing only terminals and operators on the right-hand side... The resulting expression comprising terminals and operators can be further condensed into a more concise form known as a regular expression.

To explore this concept further, I tried to convert the regular ECMAScript grammar for IdentifierName into a regular expression:

IdentifierName ::
    IdentifierStart
    IdentifierName  IdentifierPart

Let's assume that the definitions for IdentifierStart and IdentifierPart are limited to the following:

IdentifierStart ::       IdentifierPart ::
    A                        A                 
    B                        C
    C                        &
    $                    
    _

However, I'm not sure how to proceed, because the grammar for IdentifierName contains both recursion and alternation. Any suggestions?

I'm mainly interested in the method rather than just the resulting regexp, which @Bergi has already given as [ABC$_][AC&]*.

javascript ecmascript-next compiler-construction

Answer 1

The tutorial you linked uses some non-standard definitions.

Instead of the standard definition of a regular grammar (one that is either left-linear or right-linear), the tutorial appears to define regular grammars as those that use repetition operators, like the ones in regular expressions or EBNF, but no recursion. Under that definition, converting a "regular grammar" into a regex really is just a matter of substituting each non-terminal with its definition. By the same definition, however, the JavaScript specification's grammar for identifiers is not regular, because IdentifierName is defined recursively. The recursion first has to be rewritten as repetition (IdentifierName :: IdentifierStart IdentifierPart*) before the substitution step applies.
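As a rough illustration of that two-step process (rewrite the recursion as repetition, then substitute), here is a minimal JavaScript sketch for the simplified grammar from the question. The helper names `escapeForClass` and `toCharClass` are made up for this example; they are not part of any library or of the tutorial.

```javascript
// Terminal sets from the question's simplified grammar.
const identifierStart = ["A", "B", "C", "$", "_"];
const identifierPart = ["A", "C", "&"];

// Hypothetical helper: escape characters that are special inside a
// regex character class (backslash, closing bracket, caret, hyphen).
function escapeForClass(ch) {
  return ch.replace(/[\\\]^-]/g, "\\$&");
}

// Hypothetical helper: an alternation of single-character terminals
// collapses to a character class.
function toCharClass(terminals) {
  return "[" + terminals.map(escapeForClass).join("") + "]";
}

// Step 1: rewrite the left recursion as repetition:
//   IdentifierName :: IdentifierStart IdentifierPart*
// Step 2: substitute both non-terminals with their right-hand sides.
const source = toCharClass(identifierStart) + toCharClass(identifierPart) + "*";

// Anchor the pattern so it has to match the whole input.
const identifierName = new RegExp("^" + source + "$");

console.log(source); // [ABC$_][AC&]*
```

With the anchors in place, `identifierName.test("B&A")` is true, while `identifierName.test("&A")` is false because `&` is not an IdentifierStart symbol.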

These non-standard definitions limit how far the tutorial's method carries: plain substitution does not work on an arbitrary left-linear or right-linear grammar. Every regular grammar in the standard sense can still be turned into a regular expression, but the general route is to convert the grammar into a finite automaton first and then apply an established automaton-to-regex algorithm, such as state elimination.
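For the simplified grammar in the question, the automaton is small enough to write down by hand. The sketch below is an assumption about what that automaton looks like (two states, one edge, one self-loop), not output from any conversion tool. It simulates the automaton and compares it against the regex obtained by reading off the edge label followed by the starred loop label.

```javascript
// Two-state automaton for the simplified IdentifierName grammar.
// State 0 is the start state; state 1 is the only accepting state.
const START = new Set(["A", "B", "C", "$", "_"]); // IdentifierStart
const PART = new Set(["A", "C", "&"]);            // IdentifierPart

function acceptsByAutomaton(input) {
  let state = 0;
  for (const ch of input) {
    if (state === 0 && START.has(ch)) {
      state = 1; // edge 0 -> 1 labelled IdentifierStart
    } else if (state === 1 && PART.has(ch)) {
      state = 1; // self-loop on state 1 labelled IdentifierPart
    } else {
      return false; // no transition available: reject
    }
  }
  return state === 1;
}

// Eliminating the loop on state 1 gives: edge label, then starred loop label.
const fromAutomaton = /^[ABC$_][AC&]*$/;

// Enumerate every string up to maxLen over the given alphabet.
function allStrings(alphabet, maxLen) {
  let level = [""];
  const out = [""];
  for (let i = 0; i < maxLen; i++) {
    level = level.flatMap((s) => alphabet.map((c) => s + c));
    out.push(...level);
  }
  return out;
}

// Compare automaton and regex on all strings of length 0 to 4.
const alphabet = ["A", "B", "C", "$", "_", "&"];
const mismatches = allStrings(alphabet, 4).filter(
  (s) => acceptsByAutomaton(s) !== fromAutomaton.test(s)
);
console.log(mismatches.length); // 0
```

An exhaustive comparison up to a fixed length is not a proof of equivalence, but it is a cheap sanity check on a hand-derived expression.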

In practice, a manual conversion usually means working out which language the grammar describes (here, "words beginning with an IdentifierStart symbol followed by zero or more IdentifierPart symbols") and writing a regular expression for that language directly. This approach, sometimes called the "look really hard at the problem until you see the solution" algorithm, is informal, but it remains a prevalent strategy for conversions done by hand.
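Since a regex found by inspection is only a guess, it can help to check it against strings derived straight from the productions. The following sketch applies the two IdentifierName productions literally, using the simplified terminal sets from the question; the function name `derive` and the length bound are choices made for this example.

```javascript
// Generate every string the two productions derive, up to maxLen characters:
//   IdentifierName :: IdentifierStart
//   IdentifierName :: IdentifierName IdentifierPart
function derive(maxLen) {
  const starts = ["A", "B", "C", "$", "_"];
  const parts = ["A", "C", "&"];
  let current = starts; // derivations of length 1 (first production)
  const all = new Set(current);
  for (let len = 2; len <= maxLen; len++) {
    // Apply the recursive production once more to every shorter name.
    current = current.flatMap((name) => parts.map((p) => name + p));
    current.forEach((name) => all.add(name));
  }
  return all;
}

const candidate = /^[ABC$_][AC&]*$/; // the hand-written guess
const derived = derive(4);
const allMatch = [...derived].every((s) => candidate.test(s));
console.log(derived.size, allMatch); // 200 true
```

This only shows that the regex accepts everything the grammar derives up to the chosen length. Strings the grammar cannot derive, such as "&A" or "AB", should be checked separately to confirm the regex rejects them.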
