astronomy/generate/check_internal_links.py
Don Cross c86445ccce Mac fix: eliminate 'realpath' from makedoc script.
The program 'realpath' does not come installed on Mac OS.
This caused the bash script 'makedoc' to fail on Mac.
The only place I used realpath was to convert relative
paths to absolute paths for filenames passed to
check_internal_links.py.

It turns out Python has a standard function os.path.realpath()
that does the same thing, so I moved the logic into the
Python script itself. Thus makedoc no longer needs the
realpath program, and the Python function will work on
all platforms.

There is a general lesson here: in the future I will
consider moving more of my scripting logic into Python.
It has proven to be more portable than a mixture
of bash scripts and Windows batch files.
2021-12-30 11:08:09 -05:00
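
The path conversion described in the commit message can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the helper name `to_absolute` and the sample filename are hypothetical. Note that `os.path.realpath()` also resolves symbolic links, which `os.path.abspath()` does not.

```python
import os

def to_absolute(path):
    # Hypothetical helper for illustration only: convert a possibly
    # relative path to a canonical absolute path, resolving symlinks.
    return os.path.realpath(path)

# Hypothetical relative filename, built portably from path components.
relative = os.path.join('docs', os.pardir, 'README.md')
absolute = to_absolute(relative)

# The result is absolute and the '..' component has been collapsed.
print(absolute)
```

Because the conversion happens inside Python, the calling shell script can pass relative paths unchanged and does not need an external `realpath` program.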


#!/usr/bin/env python3
import sys
import re
import os

def FindBrokenLinks(text):
    # Search for all link names, of the form: (#Some.Name)
    linkSet = set(m.group(1) for m in re.finditer(r'\(#([A-Za-z0-9_\.]+)\)', text))

    # Search for all anchor names, of the form: <a name="Some.Name">
    anchorSet = set(m.group(1) for m in re.finditer(r'<\s*a\s+name\s*=\s*"([A-Za-z0-9_\.]+)"\s*>', text))

    # Find all link names for which there is no matching anchor name.
    return sorted(linkSet - anchorSet)

def FindBogusLinks(text):
    # Search for bogus links of the form [Symbol](Symbol).
    bogusSet = set()
    for m in re.finditer(r'\[([A-Za-z0-9_\.]+)\]\(([A-Za-z0-9_\.]+)\)', text):
        if m.group(1) == m.group(2):
            bogusSet.add(m.group(1))
    return bogusSet

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print('USAGE: check_internal_links.py infile.md')
        sys.exit(1)

    rc = 0
    # Convert the filename to an absolute path so messages are unambiguous.
    filename = os.path.realpath(sys.argv[1])
    with open(filename, 'rt') as infile:
        text = infile.read()

    badLinks = FindBrokenLinks(text)
    if len(badLinks) > 0:
        rc = 1
        print('ERROR(check_internal_links.py): The following {} links are bad in file {}'.format(len(badLinks), filename))
        for name in badLinks:
            print(name)

    bogusLinks = FindBogusLinks(text)
    if len(bogusLinks) > 0:
        rc = 1
        print('ERROR(check_internal_links.py): The following {} links are of the form "[Symbol](Symbol)" in file {}'.format(len(bogusLinks), filename))
        # Sort so the report order is the same on every run.
        for name in sorted(bogusLinks):
            print(name)

    if rc == 0:
        print('Links are OK in: {}'.format(filename))

    sys.exit(rc)
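
To see what the two checks report, the sketch below applies the same three regular expressions to a small in-memory string instead of a file. The snake_case function names and the sample Markdown (`Sample.Defined`, `Sample.Missing`, `Symbol`) are made up for illustration. The patterns are repeated here only so the example runs on its own.

```python
import re

# Same patterns as the script above, repeated so this sketch is self-contained.
LINK_RE = re.compile(r'\(#([A-Za-z0-9_\.]+)\)')
ANCHOR_RE = re.compile(r'<\s*a\s+name\s*=\s*"([A-Za-z0-9_\.]+)"\s*>')
BOGUS_RE = re.compile(r'\[([A-Za-z0-9_\.]+)\]\(([A-Za-z0-9_\.]+)\)')

def find_broken_links(text):
    # Link targets that have no matching <a name="..."> anchor.
    links = set(m.group(1) for m in LINK_RE.finditer(text))
    anchors = set(m.group(1) for m in ANCHOR_RE.finditer(text))
    return sorted(links - anchors)

def find_bogus_links(text):
    # Links whose target repeats the link text and lacks the leading '#'.
    return sorted(set(
        m.group(1) for m in BOGUS_RE.finditer(text)
        if m.group(1) == m.group(2)
    ))

# Hypothetical sample input: one good link, one broken link, one bogus link.
sample = '''
<a name="Sample.Defined"></a>
See [Defined](#Sample.Defined) and [Missing](#Sample.Missing).
Also see [Symbol](Symbol).
'''

broken = find_broken_links(sample)
bogus = find_bogus_links(sample)
print(broken)  # ['Sample.Missing']
print(bogus)   # ['Symbol']
```

The link to `Sample.Defined` passes because an anchor with that name exists. `Sample.Missing` is reported as broken because it has no anchor, and `[Symbol](Symbol)` is reported as bogus because its target is missing the `#` prefix.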