Commit Graph

68 Commits

Author SHA1 Message Date
Micah Snyder
201e1b12a7 XOR test files; clean up tests directory
The split test files are flagged by some AV's because they look like
broken executables. Instead of splitting the test files to prevent
detections, we should encrypt them. This commit replaces the "reassemble
testfiles" script with a basic "XOR testfiles" script that can be used
to encrypt or decrypt test files. This commit also replaces all of the
split files with XOR'ed files.
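
As a rough illustration of why one script can both encrypt and decrypt,
here is a minimal C sketch of a single-byte XOR pass. The function name
and the key value are assumptions made for this example; the key used by
the actual script is not stated in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key, chosen only for this example. */
#define XOR_KEY 0x2a

/* XOR every byte of buf in place. Running it a second time with the
 * same key restores the original bytes, so one routine covers both
 * directions. */
void xor_buffer(uint8_t *buf, size_t len, uint8_t key)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        buf[i] ^= key;
    }
}
```

Calling xor_buffer() twice on the same buffer is a no-op, which is the
property the encrypt/decrypt script relies on.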

The test and unit_tests directories were a bit of a mess, so I
reorganized them all into unit_tests with all of the test files placed
under "unit_tests/input" using subdirectories for different types of files.
2021-07-17 10:39:27 -07:00
Micah Snyder (micasnyd)
b9ca6ea103 Update copyright dates for 2021
Also fixes up clang-format.
2021-03-19 15:12:26 -07:00
Micah Snyder
afbf0b6180 Fix Windows text file EOL conversion issues
On Windows, files open()'ed without the O_BINARY flag will have new-line
LF (aka \n) converted to CRLF (aka \r\n) automatically when read from or
written to. This is undesirable for all scan targets AND temp files
because it affects pattern matching and hashing.

This commit converts a handful of instances throughout the codebase
where it appears that O_BINARY was mistakenly omitted and could result
in unexpected behavior on Windows.
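
A minimal sketch of the pattern, assuming a POSIX-style open(): the
helper name read_verbatim() is hypothetical, and defining O_BINARY as 0
where the platform lacks it is one common way to keep the call portable.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#ifdef _WIN32
#include <io.h>
#else
#include <unistd.h>
#endif

/* O_BINARY is a Windows flag; make it a no-op on other platforms. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: read up to len bytes from path with no EOL
 * translation. Returns the number of bytes read, or -1 on error. */
long read_verbatim(const char *path, void *buf, size_t len)
{
    long n;
    int fd;

    assert(path != NULL && buf != NULL);
    fd = open(path, O_RDONLY | O_BINARY);
    if (fd < 0) {
        return -1;
    }
    n = (long)read(fd, buf, len);
    close(fd);
    return n;
}
```

With the flag in place, a file containing "\r\n" comes back byte for
byte, which is what pattern matching and hashing need.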

Git on Windows also converts LF -> CRLF for "text" files, for editing
purposes.
This is problematic for scan files and test files that should match
verbatim.
We can prevent this issue by marking .ref test files as "binary" in the
.gitattributes file and by always opening scan files and temp files as
binary.

In this commit I've also removed the `ChangeLog merge=cl-merge` line
that was once used to reduce ChangeLog merge conflicts by using the
gnulib git-merge-changelog tool. This project now categorizes changes in
the NEWS.md.
For finer detail, git commit history is fully accessible on github.com.
2021-02-25 11:41:28 -08:00
Micah Snyder
2552cfd0d1 CMake: Add CTest support to match Autotools checks
An ENABLE_TESTS CMake option is provided so that users can disable
testing if they don't want it. Instructions for how to use this are
included in the INSTALL.cmake.md file.

If you run `ctest`, each testcase will write out a log file to the
<build>/unit_tests directory.

As with Autotools' make check, the test files are from test/.split
and unit_tests/.split files, but for CMake these are generated at
build time instead of at test time.

On Posix systems, sets the LD_LIBRARY_PATH so that ClamAV-compiled
libraries can be loaded when running tests.

On Windows systems, CTest will identify and collect all library
dependencies and assemble a temporary install under the
build/unit_tests directory so that the libraries can be loaded when
running tests.

The same feature is used on Windows when using CMake to install to
collect all DLL dependencies so that users don't have to install them
manually afterwards.

Each of the CTest tests are run using a custom wrapper around Python's
unittest framework, which is also responsible for finding and inserting
valgrind into the valgrind tests on Posix systems.

Unlike with Autotools, the CMake CTest Valgrind-tests are enabled by
default, if Valgrind can be found. There's no need to set VG=1.
CTest's memcheck module is NOT supported, because we use Python to
orchestrate our tests.

Added a bunch of Windows compatibility changes to the unit tests.
These were primarily changing / to PATHSEP and making adjustments
to use Win32 C headers and ifdef out the POSIX ones which aren't
available on Windows. Also disabled a bunch of tests on Win32
that don't work on Windows, notably the mmap ones and FD-passing
(i.e. FILEDES) ones.

Add JSON_C_HAVE_INTTYPES_H definition to clamav-config.h to eliminate
warnings on Windows where json.h is included after inttypes.h because
json-c's inttypes replacement relies on it.
This is a bit of a hack and may be removed if json-c fixes their
inttypes header stuff in the future.

Add preprocessor definitions on Windows to disable MSVC warnings about
CRT secure and nonstandard functions. While there may be a better
solution, this is needed to be able to see other more serious warnings.

Add missing file comment block and copyright statement for clamsubmit.c.
Also change json-c/json.h include filename to json.h in clamsubmit.c.
The directory name is not required.

Changed the hash table data integer type from long, which is poorly
defined, to size_t -- which is capable of storing a pointer. Fixed a
bunch of casts regarding this variable to eliminate warnings.

Fixed two bugs causing utf8 encoding unit tests to fail on Windows:
- The in_size variable should be the number of bytes, not the character
  count. This was causing the SHIFT_JIS (Japanese codepage) to UTF8
  transcoding test to only transcode half the bytes.
- It turns out that the MultiByteToWideChar() API can't transcode
  UTF16-BE to UTF16-LE. The solution is to just iterate over the buffer
  and flip the bytes on each uint16_t. This bug was causing the UTF16-BE
  to UTF8 tests to fail.
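
The byte-flipping approach from the second fix can be sketched as
follows. This is an illustration rather than the code from this commit:
the function name is hypothetical, and it walks the buffer as bytes
instead of uint16_t values to sidestep alignment concerns. Note that the
length is a byte count, matching the first fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of every 16-bit code unit in place, turning
 * UTF-16BE into UTF-16LE (or the reverse). len_bytes is a byte count,
 * not a character count. A trailing odd byte is left untouched. */
void utf16_swap_bytes(uint8_t *buf, size_t len_bytes)
{
    size_t i;

    assert(buf != NULL || len_bytes == 0);
    for (i = 0; i + 1 < len_bytes; i += 2) {
        uint8_t tmp = buf[i];
        buf[i]      = buf[i + 1];
        buf[i + 1]  = tmp;
    }
}
```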

I also split up the utf8 transcoding tests into separate tests so I
could see all of the failures instead of just the first one.

Added a flags parameter to the unit test function to open testfiles
because it turns out that on Windows if a file contains the \r\n it will
replace it with just \n if you opened the file as a text file instead of
as binary. However, if we open the CBC files as binary, then a bunch of
bytecode tests fail. So I've changed the tests to open the CBC files in
the bytecode tests as text files and open all other files as binary.

Ported the feature tests from shell scripts to Python using a modified
version of our QA test-framework, which is largely compatible and will
allow us to migrate some QA tests into this repo. I'd like to add GitHub
Actions pipelines in the future so that all public PR's get some testing
before anyone has to manually review them.

The clamd --log option was missing from the help string, though it
definitely works. I've added it in this commit.
It appears that clamd.c was never clang-format'd, so this commit also
reformats clamd.c.

Some of the check_clamd tests expected the path returned by clamd to
match character for character with original path sent to clamd. However,
as we now evaluate real paths before a scan, the path returned by clamd
isn't going to match the relative (and possibly symlink-ridden) path
passed to clamdscan. I fixed this test by changing the test to search
for the basename: <signature> FOUND within the response instead of
matching the exact path.
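
A sketch of that basename-based check, assuming '/' as the path
separator. The function name and the fixed-size buffer are choices made
for this example and are not taken from check_clamd.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if response contains "<basename>: <signature> FOUND",
 * otherwise 0. Only the last path component is compared, so it does
 * not matter whether clamd resolved the path to something else. */
int response_reports_found(const char *response, const char *path,
                           const char *signature)
{
    char expected[512];
    const char *base;
    int n;

    assert(response != NULL && path != NULL && signature != NULL);
    base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;

    n = snprintf(expected, sizeof(expected), "%s: %s FOUND", base, signature);
    if (n < 0 || (size_t)n >= sizeof(expected)) {
        return 0; /* the expected string would not fit in the buffer */
    }
    return strstr(response, expected) != NULL;
}
```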

Autotools: Link check_clamd with libclamav so we can use our utility
functions in check_clamd.c.
2021-02-25 11:41:26 -08:00
Micah Snyder (micasnyd)
9e20cdf6ea Add CMake build tooling
This patch adds experimental-quality CMake build tooling.

The libmspack build required a modification to use "" instead of <> for
header #includes. This will hopefully be included in the libmspack
upstream project when adding CMake build tooling to libmspack.

Removed use of libltdl when using CMake.

Flex & Bison are now required to build.

If -DMAINTAINER_MODE, then GPERF is also required, though it currently
doesn't actually do anything.  TODO!

I found that the autotools build system was generating the lexer output
but not actually compiling it, instead using previously generated (and
manually renamed) lexer c source. As a consequence, changes to the .l
and .y files weren't making it into the build. To resolve this, I
removed generated flex/bison files and fixed the tooling to use the
freshly generated files. Flex and bison are now required build tools.
On Windows, this adds a dependency on the winflexbison package,
which can be obtained using Chocolatey or may be manually installed.

CMake tooling only has partial support for building with external LLVM
library, and no support for the internal LLVM (to be removed in the
future). I.e. The CMake build currently only supports the bytecode
interpreter.

Many files used include paths relative to the top source directory or
relative to the current project, rather than relative to each build
target. Modern CMake support requires including internal dependency
headers the same way you would external dependency headers (albeit
with "" instead of <>). This meant correcting all header includes to
be relative to the build targets and not relative to the workspace.

For example, ...

```c
#include "../libclamav/clamav.h"
#include "clamd/clamd_others.h"
```

... becomes:

```c
// libclamav
#include "clamav.h"

// clamd
#include "clamd_others.h"
```

Fixes header name conflicts by renaming a few of the files.

Converted the "shared" code into a static library, which depends on
libclamav. The ironically named "shared" static library provides
features common to the ClamAV apps which are not required in
libclamav itself and are not intended for use by downstream projects.
This change was required for correct modern CMake practices but was
also required to use the automake "subdir-objects" option.
This eliminates warnings when running autoreconf which, in the next
version of autoconf & automake are likely to break the build.

libclamav used to build in multiple stages where an earlier stage is
a static library containing utils required by the "shared" code.
Linking clamdscan and clamdtop with this libclamav utils static lib
allowed these two apps to function without libclamav. While this is
nice in theory, the practical gains are minimal and it complicates
the build system. As such, the autotools and CMake tooling was
simplified for improved maintainability and this feature was thrown
out. clamdtop and clamdscan now require libclamav to function.

Removed the nopthreads version of the autotools
libclamav_internal_utils static library and added pthread linking to
a couple apps that may have issues building on some platforms without
it, with the intention of removing needless complexity from the
source. Kept the regular version of libclamav_internal_utils.la
though it is no longer used anywhere but in libclamav.

Added an experimental doxygen build option which attempts to build
clamav.h and libfreshclam doxygen html docs.

The CMake build tooling also may build the example program(s), which
isn't a feature in the Autotools build system.

Changed C standard to C90+ due to inline linking issues with socket.h
when linking libfreshclam.so on Linux.

Generate common.rc for win32.

Fix tabs/spaces in shared Makefile.am, and remove vestigial ifndef
from misc.c.

Add CMake files to the automake dist, so users can try the new
CMake tooling w/out having to build from a git clone.

clamonacc changes:
- Renamed FANOTIFY macro to HAVE_SYS_FANOTIFY_H to better match other
  similar macros.
- Added a new clamav-clamonacc.service systemd unit file, based on
  the work of ChadDevOps & Aaron Brighton.
- Added missing clamonacc man page.

Updates to clamdscan man page, add missing options.

Remove vestigial CL_NOLIBCLAMAV definitions (all apps now use
libclamav).

Rename Windows mspack.dll to libmspack.dll so all ClamAV-built
libraries have the lib-prefix with Visual Studio as with CMake.
2020-08-13 00:25:34 -07:00
Micah Snyder
e2f59af30a Clang-format touchup 2020-07-24 16:37:25 -07:00
Micah Snyder
005cbf5a37 Record names of extracted files
A way is needed to record scanned file names for two purposes:

1. File names (and extensions) must be stored in the json metadata
properties recorded when using the --gen-json clamscan option. Future
work may use this to compare file extensions with detected file types.

2. File names are useful when interpreting tmp directory output when
using the --leave-temps option.

This commit enables file name retention for later use by storing file
names in the fmap header structure, if a file name exists.

To store the names in fmaps, an optional name argument has been added to
any internal scan API's that create fmaps and every call to these APIs
has been modified to pass a file name or NULL if a file name is not
required.  The zip and gpt parsers required some modification to record
file names.  The NSIS and XAR parsers fail to collect file names at all
and will require future work to support file name extraction.

Also:

- Added recursive extraction to the tmp directory when the
  --leave-temps option is enabled.  When not enabled, the tmp directory
  structure remains flat to reduce the likelihood of exceeding
  MAX_PATH.  The current tmp directory is stored in the scan context.

- Made the cli_scanfile() internal API non-static and added it to
  scanners.h so it would be accessible outside of scanners.c in order to
  remove code duplication within libmspack.c.

- Added function comments to scanners.h and matcher.h

- Converted TDB-type macros and LSIG-type macros to enums for improved
  type safety.

- Converted more return status variables from `int` to `cl_error_t` for
  improved type safety, and corrected ooxml file typing functions so
  they use `cli_file_t` exclusively rather than mixing types with
  `cl_error_t`.

- Restructured the magic_scandesc() function to use goto's for error
  handling and removed the early_ret_from_magicscan() macro and
  magic_scandesc_cleanup() function.  This makes the code easier to
  read and made it easier to add the recursive tmp directory cleanup to
  magic_scandesc().

- Corrected zip, egg, rar filename extraction issues.

- Removed use of extra sub-directory layer for zip, egg, and rar file
  extraction.  For Zip, this also involved changing the extracted
  filenames to be randomly generated rather than using the "zip.###"
  file name scheme.
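
For readers unfamiliar with the goto-based error handling mentioned for
magic_scandesc(), the general pattern looks roughly like this. The
function below is a made-up example and does not come from scanners.c;
it only shows how a single exit label keeps all cleanup in one place.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: copy the first len bytes of a file into out.
 * Returns 0 on success and -1 on any failure. Every early exit jumps
 * to "done", so resources are released in exactly one place. */
int copy_first_bytes(const char *path, char *out, size_t len)
{
    int status = -1;
    FILE *fp   = NULL;
    char *tmp  = NULL;

    assert(path != NULL && out != NULL);

    fp = fopen(path, "rb");
    if (fp == NULL) {
        goto done;
    }
    tmp = malloc(len);
    if (tmp == NULL) {
        goto done;
    }
    if (fread(tmp, 1, len, fp) != len) {
        goto done; /* short read: file is smaller than len */
    }
    memcpy(out, tmp, len);
    status = 0;

done:
    free(tmp); /* free(NULL) is a no-op */
    if (fp != NULL) {
        fclose(fp);
    }
    return status;
}
```

Adding another cleanup step, such as removing a temporary directory,
then means touching only the block after the label.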
2020-06-03 10:39:18 -04:00
Micah Snyder (micasnyd)
485d8dec67 Check test support for check 0.13
Tests in libcheck 0.13 must have {} between START_TEST and END_TEST
else it will not compile.

Also replaced all deprecated "fail_" macros with "ck_" macros.
E.g. fail_unless() becomes ck_assert_msg()

The checks_common.h header file provided a couple of macros to
support versions older than 0.9.3.  As these older versions are
no longer relevant, I've removed those compatibility macros
entirely.
2020-01-15 08:14:23 -08:00
Micah Snyder
206dbaefe8 Update copyright dates for 2020 2020-01-03 15:44:07 -05:00
Micah Snyder
52cddcbcfd Updating and cleaning up copyright notices. 2019-10-02 16:08:18 -04:00
Micah Snyder
72fd33c8b2 clang-format'd using new .clang-format rules. 2019-10-02 16:08:16 -04:00
Micah Snyder
d7979d4ff7 Restructured scan options flags from a single bitflag field to a structure containing multiple bitflag fields. This also required adding a new function to the bytecode API to get scan options a la carte, and modifying the existing function to hand back scan options in the old/deprecated uint32_t bitflag format. Re-generated bytecode iface header files.
Updated libclamav documentation detailing new scan options structure.
Renamed references to 'algorithmic' detection to 'heuristic' detection. Renamed references to 'properties' to 'collect metadata'.
Renamed references to 'scan all' to 'scan all match'.
Renamed a couple of 'Hueristic.*' signature names as 'Heuristics.*' signatures (plural) to match majority of other heuristics.
2018-12-02 23:06:59 -05:00
Mickey Sola
46a35abe56 mass update of copyright headers 2015-09-17 13:41:26 -04:00
Shawn Webb
60d8d2c352 Move all the crypto API to clamav.h 2014-07-01 19:38:01 -04:00
Shawn Webb
b2e7c931d0 Use OpenSSL for hashing. 2014-02-08 00:31:12 -05:00
Steve Morgan
54402320c0 Add bytecode performance statistics 2012-12-05 15:48:52 -08:00
Steve Morgan
03b99d0311 fix compiler warning 2012-10-19 09:39:06 -07:00
Steve Morgan
6ad45a2931 add initial allscan/allmatch mode to libclamav, clamd, clamdscan, and clamscan with unit tests 2012-10-18 14:12:58 -07:00
Török Edvin
62ee12b2f8 unit tests for new fmap scan API 2011-06-14 22:35:03 +03:00
Török Edvin
5007986ffd Fix build on Etch (bb #2399). 2011-01-20 10:03:55 +02:00
Török Edvin
73489b8b4c s/glibc 2.2/glibc 2.3/ for pthread_barrier. 2010-11-23 16:38:15 +02:00
Török Edvin
314c415b02 Restrict usage of pthread_barrier even more.
Don't want to add a configure test for it now...
2010-11-23 13:31:47 +02:00
Török Edvin
e3a9786792 Only use pthread_barrier* on Linux.
Looks like Solaris and Mac OS X didn't hear of barriers :(
2010-11-04 23:01:22 +02:00
Török Edvin
71a5cb434e Fix build error. 2010-11-04 22:13:35 +02:00
Török Edvin
a7cf187a0c Make cl_load thread safe (bb #2333).
Parallel cl_load() crash (bb #2333).
Reason is twofold:
 - cache.c had 2 'static' global variables, thus trying to initialize same cache
   from multiple threads
 - bytecode2llvm.cpp: something in LLVM 2.7 is crashing when loading in
   parallel

Fix is to drop the 'static' on the variable (cache is per engine already).
This also fixes a potential memory leak in clamd!

The other part of the fix is to turn on the mutex around bytecode compilation
always. We don't call cl_load in parallel, so this doesn't affect clamd, but
some may need to call cl_load in parallel.
2010-11-04 21:53:03 +02:00
Török Edvin
47cee5042d fix unit test for JIT test mode. 2010-10-18 11:08:14 +03:00
Török Edvin
540fc128a0 freshclam is using private symbol that changed proto (bb #2187).
Change name to prevent crash with 0.96.1 freshclam and 0.96.2 libclamav.
You'll get a missing symbol error.
2010-08-11 14:26:10 +03:00
Török Edvin
213dfdff06 run 1 unit-test at least in test mode (bb #2151).
Also allow running test mode if JIT is not available, still checking
for failed startup.cbc execution.
2010-08-02 19:00:12 +03:00
Török Edvin
d049a2f72b Make bytecode tests use testmode if they want.
None uses it yet.
2010-07-29 14:07:00 +03:00
Török Edvin
927d054838 Add engine param to bytecode, and remove dconf from _init. 2010-07-29 13:48:18 +03:00
Török Edvin
1ab57a63c7 Add bytecode.cvd load test. 2010-05-14 17:19:26 +03:00
Török Edvin
3d2808c218 bytecode: update unit tests for improved arithmetic test. 2010-05-14 10:41:50 +03:00
Török Edvin
aadccfd1c8 Fix valgrind warnings. 2010-05-13 23:35:47 +03:00
Török Edvin
fc01c6476f Fix interpreter. 2010-05-13 23:25:11 +03:00
Török Edvin
a969167b6c Add new bytecode API unit tests. 2010-05-13 22:44:29 +03:00
Török Edvin
e4a0f2c94f fix compiler warnings (bb #1872, bb #1934, bb #1935) 2010-04-13 16:19:47 +03:00
Török Edvin
d772904022 Fix matchwithread.cbc
ImageBase is little-endian, need to use conversion
function to access it.
2010-04-02 13:13:17 +03:00
Török Edvin
e407d32d04 Increase timeout of testcase itself. 2010-03-30 11:10:58 +03:00
Török Edvin
ce288463e2 Increase bytecode timeout for non-timeout tests. 2010-03-30 00:04:38 +03:00
Török Edvin
fa82ce037a Separate bytecode tests into jit and interpreter.
This makes it easier to see which one has a problem, and also reduces the
runtime of individual tests.
2010-03-29 12:09:07 +03:00
Török Edvin
1678ef9e43 Fix inflate.cbc for the interpreter. 2010-03-29 11:38:52 +03:00
Török Edvin
0d9b99f43e Fix bswap.cbc in interpreter mode. 2010-03-28 23:49:25 +03:00
Török Edvin
041bc64aab Increase timeout in unit test (bb #1899). 2010-03-26 16:50:30 +02:00
Török Edvin
e439954b51 Fix valgrind warnings. 2010-03-24 17:37:23 +02:00
Török Edvin
778df8c22f Fix more leaks. 2010-03-24 17:08:20 +02:00
Török Edvin
6ea339aeab Fix bswap. 2010-03-24 15:27:15 +02:00
Török Edvin
48fc8b9852 Leak testcase. 2010-03-24 14:14:33 +02:00
Török Edvin
b26d43809a Add matchwithread.cbc to unit tests. 2010-03-24 12:46:34 +02:00
Török Edvin
74f5816c58 Interpreter fixes for accessing 'ctx'.
This allows all cbcs in unit_tests/input to pass.
Not yet working on bytecode.cvd though.
2010-03-23 21:47:57 +02:00
Török Edvin
bdd9aeaeeb Use a watchdog thread. Also make timeout be ms instead of us. 2010-03-23 16:33:41 +02:00