pdfme/packages/pdf-lib/__tests__/utils/pdfDocEncoding.spec.ts
devin-ai-integration[bot] e4a4c300cd Migrate pdf-lib into pdfme monorepo (#1059)
* Migrate pdf-lib into pdfme monorepo

- Add @pdfme/pdf-lib package to packages/ directory
- Update root package.json to include pdf-lib in workspaces
- Update all package dependencies to use workspace:* for @pdfme/pdf-lib
- Configure TypeScript build targets (cjs, esm, node) for pdf-lib
- Add ESLint configuration with relaxed rules for pdf-lib migration
- Integrate pdf-lib into monorepo build and clean scripts
- Add basic test suite for pdf-lib package
- All lint, build, and test suites pass successfully

This migration improves maintainability by consolidating all PDF operations
into a single repository and unified build/test/release process.

Co-Authored-By: Kyohei Fukuda <kyoheif@wix.com>

* Fix TypeScript module resolution for workspace dependencies

- Changed moduleResolution from 'bundler' to 'node' in common package
- This should resolve '@pdfme/pdf-lib' module resolution issues
- Reverted workspace dependency format back to '*' for npm compatibility

Co-Authored-By: Kyohei Fukuda <kyoheif@wix.com>

* Fix pdf-lib package.json exports paths

- Updated main, module, and exports paths to point to correct locations
- Changed from dist/*/index.js to dist/*/src/index.js to match build output
- Fixed TypeScript types path from dist/types/index.d.ts to dist/types/src/index.d.ts
- Resolves Vite package entry resolution errors and TypeScript module resolution issues

Co-Authored-By: Kyohei Fukuda <kyoheif@wix.com>

* Fix CodeQL security alerts in svg.ts

- Add input validation and sanitization for HTML/SVG parsing
- Prevent ReDoS attacks with regex limits and input size checks
- Sanitize font family names to prevent prototype pollution
- Add URL validation for image sources to prevent path traversal
- Limit transformation parsing to prevent infinite loops
- Maintain backward compatibility while improving security

Co-Authored-By: Kyohei Fukuda <kyoheif@wix.com>

* Implement comprehensive security fixes for CodeQL alerts in svg.ts

- Add input validation and sanitization for SVG content
- Implement safe HTML parsing with null checks and size limits
- Add controlled dynamic property access with allowlisted tag names
- Prevent style injection with filtered and limited style entries
- Add regex match limits to prevent ReDoS attacks
- Enhance font selection with input validation and type safety
- Sanitize image sources to prevent path traversal and injection
- Limit CSS style parsing to prevent potential vulnerabilities

These changes address the 2 high-severity CodeQL security alerts while
maintaining backward compatibility and functionality.

Co-Authored-By: Kyohei Fukuda <kyoheif@wix.com>

* Add additional security fixes for CodeQL alerts in svg.ts

- Implement safer property access for polygon node transformation
- Add input validation for points attribute with regex pattern matching
- Replace Object.assign with safer property assignment to prevent prototype pollution
- Add null checks and type validation for node attributes and childNodes
- Implement safer SVG node parsing with comprehensive validation
- Add array type checks for childNodes processing

These changes target the remaining 2 high-severity CodeQL security alerts
by addressing potential prototype pollution and unsafe property access.

Co-Authored-By: Kyohei Fukuda <kyoheif@wix.com>

* Implement comprehensive security hardening for CodeQL alerts in svg.ts

- Add comprehensive SVG content sanitization with allowlist-based tag filtering
- Implement strict input validation with bounds checking for all numeric inputs
- Replace unsafe dynamic property assignment with Object.defineProperty
- Add try-catch error handling for HTML parsing operations
- Restrict allowed style properties and validate string lengths
- Use setAttribute/removeAttribute instead of direct attribute manipulation
- Add type safety checks for all node operations
- Implement safer polygon-to-path conversion with validation

These changes address the 10 high-severity CodeQL security alerts by:
1. Preventing XSS through comprehensive input sanitization
2. Avoiding prototype pollution with safer property assignment
3. Adding bounds checking to prevent DoS attacks
4. Using allowlist-based validation for all user inputs
5. Implementing proper error handling to prevent crashes

Co-Authored-By: Kyohei Fukuda <kyoheif@wix.com>

* Potential fix for code scanning alert no. 32: Incomplete multi-character sanitization

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Potential fix for code scanning alert no. 39: Incomplete multi-character sanitization

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Fix inefficient regular expression in svg.ts to pass CodeQL

- Changed /([^:\s]+)*\s*:\s*([^;]+)/g to /([^:\s]+)\s*:\s*([^;]+)/g
- Removed the problematic * quantifier that could cause exponential backtracking
- This fixes the "Inefficient regular expression" security alert from GitHub Advanced Security

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* remove sanitize-html

* move tests

* fix for security

* update dependabot.yml

* organize

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Kyohei Fukuda <kyouhei.fukuda0729@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-06-26 18:30:05 +09:00

import { range, pdfDocEncodingDecode } from '../../src/utils';

type Mapping = [number, string];

const identityMapping = (code: number): Mapping => [
  code,
  String.fromCodePoint(code),
];

// Define mappings (see "Table D.2 PDFDocEncoding Character Set" of the PDF spec)
const mappings: Mapping[] = [
  ...range(0x00, 0x15 + 1).map(identityMapping),
  [0x16, '\u0017'],
  [0x17, '\u0017'],
  [0x18, '\u02D8'],
  [0x19, '\u02C7'],
  [0x1a, '\u02C6'],
  [0x1b, '\u02D9'],
  [0x1c, '\u02DD'],
  [0x1d, '\u02DB'],
  [0x1e, '\u02DA'],
  [0x1f, '\u02DC'],
  ...range(0x20, 0x7e + 1).map(identityMapping),
  [0x7f, '\uFFFD'],
  [0x80, '\u2022'],
  [0x81, '\u2020'],
  [0x82, '\u2021'],
  [0x83, '\u2026'],
  [0x84, '\u2014'],
  [0x85, '\u2013'],
  [0x86, '\u0192'],
  [0x87, '\u2044'],
  [0x88, '\u2039'],
  [0x89, '\u203A'],
  [0x8a, '\u2212'],
  [0x8b, '\u2030'],
  [0x8c, '\u201E'],
  [0x8d, '\u201C'],
  [0x8e, '\u201D'],
  [0x8f, '\u2018'],
  [0x90, '\u2019'],
  [0x91, '\u201A'],
  [0x92, '\u2122'],
  [0x93, '\uFB01'],
  [0x94, '\uFB02'],
  [0x95, '\u0141'],
  [0x96, '\u0152'],
  [0x97, '\u0160'],
  [0x98, '\u0178'],
  [0x99, '\u017D'],
  [0x9a, '\u0131'],
  [0x9b, '\u0142'],
  [0x9c, '\u0153'],
  [0x9d, '\u0161'],
  [0x9e, '\u017E'],
  [0x9f, '\uFFFD'],
  [0xa0, '\u20AC'],
  ...range(0xa1, 0xac + 1).map(identityMapping),
  [0xad, '\uFFFD'],
  ...range(0xae, 0xff + 1).map(identityMapping),
];

describe(`pdfDocEncodingDecode`, () => {
  it(`maps all PDFDocEncoding codes from 0-255 to the correct Unicode code points`, () => {
    // Make sure we have defined mappings for all codes from 0-255
    expect(mappings.map(([code]) => code).sort((a, b) => a - b)).toEqual(
      range(0, 256),
    );

    // Now make sure that `pdfDocEncodingDecode` decodes everything correctly
    mappings.forEach(([input1, expected1]) => {
      const actual1 = pdfDocEncodingDecode(Uint8Array.of(input1));
      expect(actual1).toBe(expected1);
    });

    // Let's do it again but all at once instead of passing each code separately
    const input2 = Uint8Array.from(mappings.map(([code]) => code));
    const expected2 = mappings.map(([, str]) => str).join('');
    const actual2 = pdfDocEncodingDecode(input2);
    expect(actual2).toEqual(expected2);
  });
});
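
The behavior this spec checks can be sketched as a 256-entry lookup table: every code defaults to the Unicode code point with the same value, and only the exceptions listed in the mappings above are overridden. The snippet below is a minimal sketch under that assumption, not the implementation exported from '../../src/utils'. The names `buildSketchTable` and `decodeSketch` are hypothetical, and the table values are copied from the mappings in this spec rather than checked independently against the PDF specification.

```typescript
// Hypothetical sketch of a table-driven PDFDocEncoding decoder.
// It mirrors the expectations in the spec above; it is NOT the pdf-lib code.
const buildSketchTable = (): Uint16Array => {
  const table = new Uint16Array(256);

  // Default: each code maps to the Unicode code point with the same value.
  for (let code = 0; code < 256; code++) table[code] = code;

  // Helper that overrides a contiguous run of codes starting at `start`.
  const setRun = (start: number, codePoints: number[]): void => {
    codePoints.forEach((codePoint, offset) => {
      table[start + offset] = codePoint;
    });
  };

  // Exceptions, copied from the `mappings` array in the spec.
  table[0x16] = 0x0017;
  setRun(0x18, [
    0x02d8, 0x02c7, 0x02c6, 0x02d9, 0x02dd, 0x02db, 0x02da, 0x02dc,
  ]);
  table[0x7f] = 0xfffd; // no mapping, so the replacement character is used
  setRun(0x80, [
    0x2022, 0x2020, 0x2021, 0x2026, 0x2014, 0x2013, 0x0192, 0x2044,
    0x2039, 0x203a, 0x2212, 0x2030, 0x201e, 0x201c, 0x201d, 0x2018,
    0x2019, 0x201a, 0x2122, 0xfb01, 0xfb02, 0x0141, 0x0152, 0x0160,
    0x0178, 0x017d, 0x0131, 0x0142, 0x0153, 0x0161, 0x017e, 0xfffd,
    0x20ac,
  ]);
  table[0xad] = 0xfffd; // no mapping, so the replacement character is used

  return table;
};

const sketchTable = buildSketchTable();

// Decode a byte array by looking up each byte in the table.
// All targets are in the Basic Multilingual Plane, so one UTF-16
// code unit per byte is enough and String.fromCharCode is safe here.
const decodeSketch = (bytes: Uint8Array): string => {
  let out = '';
  for (let i = 0; i < bytes.length; i++) {
    out += String.fromCharCode(sketchTable[bytes[i]]);
  }
  return out;
};

// Usage: a bullet, the letter A, and a euro sign.
const sample = decodeSketch(Uint8Array.of(0x80, 0x41, 0xa0));
console.log(sample === '\u2022A\u20AC'); // prints true
```

Building the table once and indexing into it keeps decoding at one array lookup per byte, which also makes it easy to compare against the per-code and all-at-once checks in the spec.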