Files
weewx/docs/devnotes.htm

402 lines
23 KiB
HTML

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>weewx: Developer's Notes</title>
<meta http-equiv="Content-Language" content="en-us"/>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="css/ui-lightness/jquery-ui-1.10.4.custom.min.css"/>
<link rel="stylesheet" href="css/jquery.tocify.css"/>
<link rel="stylesheet" href="css/weewx_docs.css"/>
<link rel="icon" href="images/favicon.png" type="image/png">
</head>
<body>
<div class="sidebar">
<div class="doclist">
<a href="usersguide.htm">User's Guide</a><br/>
<a href="customizing.htm">Customization Guide</a><br/>
<a href="hardware.htm">Hardware Guide</a><br/>
<a href="utilities.htm">Utilities Guide</a><br/>
<a href="upgrading.htm">Upgrade Guide</a><br/>
<a href="devnotes.htm">Notes for Developers</a>
</div>
<div id="toc_controls"></div>
<div id="toc_parent">
<div id="toc">
<!-- The table of contents will be injected here -->
</div>
</div>
</div>
<div class="main">
<div class="header">
<div class="logoref">
<a href='http://weewx.com'> <img src='images/logo-weewx.png' class='logo' align='right' alt="weewx logo"/>
</a><br/> <span class='version'>
Version: 3.6.1
</span>
</div>
<div class="title">
Notes for Developers of the <span class="code">weewx</span> Weather System
</div>
</div>
<div id="technical_content" class="content">
<p>This guide is intended for developers contributing to the open source project <span class="code">weewx</span>.</p>
<h1>Goals</h1>
<p>The primary design goals of <span class="code">weewx</span> are:</p>
<ul>
<li>Architectural simplicity. No semaphores, no named pipes, no
inter-process communications, no complex multi-threading to
manage.
</li>
<li>Extensibility. Make it easy for the user to add new features or to modify existing features.
</li>
<li>&quot;Fast enough&quot; In any design decision, architectural simplicity and elegance trump speed.
</li>
<li>One code base. A single code base should be used for all platforms, all weather stations, all reports,
and any combination of features. Ample configuration and customization options should be provided so the
user does not feel tempted to start hacking code. At worse, the user may have to subclass, which is much
easier to port to newer versions of the code base, than customizing the base code.
</li>
<li>Minimal dependencies. The code should rely on a minimal
number of external packages, so the user does not have to go
chase them down all over the Web before getting started.
</li>
<li>Simple data model. The implementation should use a very
simple data model that is likely to support many different
types of hardware.
</li>
<li>A &quot;pythonic&quot; code base. The code should be written
so others will find idioms that they recognize.
</li>
</ul>
<h1>Strategies</h1>
<p>To meet these goals, the following strategies were used: </p>
<ul>
<li>A &quot;micro-kernel&quot; design. The <span class="code">weewx</span> engine actually does very little.
Its primary job is to load and run <em> services</em> at runtime, making it easy for users to add or
subtract features.
</li>
<li>A largely stateless design style. For example, many of the processing routines read the data they need
directly from the database, rather than caching it and sharing with other routines. While this means the
same data may be read multiple times, it also means the only point of possible cache incoherence is
through the database, where transactions are easily controlled. This greatly reduces the chances of
corrupting the data, making it much easier to understand and modify the code base.
</li>
<li>Isolated data collection and archiving. The code for
collecting and archiving data run in a single thread that is
simple enough that it is unlikely to crash. The report
processing is where most mistakes are likely to happen, so
isolate that in a separate thread. If it crashes, it will not
affect the main data thread.
</li>
<li>A powerful configuration parser. The
<a href="http://www.voidspace.org.uk/python/configobj.html">ConfigObj</a> module, by Michael Foord and Nicola
Larosa, was chosen to read the configuration file. This allows many options that might otherwise have to
go in the code to go instead in a configuration file.
</li>
<li>A powerful templating engine. The
<a href="http://www.cheetahtemplate.org/">Cheetah</a> module
was chosen for generating html and other types of files from
templates. Cheetah
allows <em>search list extensions</em> to be defined, making it easy to extend weewx with new template
tags. Unfortunately, as of 2016, Cheetah has not been updated in many years (indeed, the Cheetah website
seems to be dead). Fortunately, Cheetah seems to be very robust, with only a few well-known bugs that
are easiy worked around, so we will likely continue to use it for the foreseeable future.
</li>
<li>Pure Python. The code base is 100% Python &mdash; no underlying C libraries need be built to install
<span class="code">weewx</span>. This also means no Makefiles are needed.
</li>
</ul>
<p>While <span class="code">weewx</span> is nowhere near as fast at generating images and HTML as its
predecessor, <span class="code">wview</span> (this is partially because <span class="code">weewx</span> uses
fancier fonts and a much more powerful templating engine), it is fast enough for all platforms but the
slowest. I run it regularly on a 500 MHz machine where generating the 9 images used in the &quot;Current
Conditions&quot; page takes just under 2 seconds (compared with 0.4 seconds for <span
class="code">wview</span>). </p>
<p>Unfortunately, the architectural goal of one code base is likely to be broken with the arrival of Python
V3.X. It has so many changes that are not backwards compatible with V2.X, that a separate code base will
most likely be needed. My intention is to stick with the V2.X versions until V3.X is so widespread it cannot
be ignored, then make a permanent switch. Given the slow adoption rate of V3.X this is unlikely to happen
anytime soon. In any case, I doubt the transition will affect the average <span class="code">weewx</span>
user. </p>
<p>All writes to the databases are protected by transactions. You can kill the program at any time (either
Control-C if run directly or &quot;<span class="code">/etc/init.d/weewx
stop</span>&quot; if run as a daemon) without fear of corrupting the databases. </p>
<p>The code makes ample use of exceptions to insure graceful recovery from problems such as network outages. It
also monitors socket and console timeouts, restarting whatever it was working on several times before giving
up. In the case of an unrecoverable console error (such as the console not responding at all), the program
waits 60 seconds then restarts the program from the top.</p>
<p>Any &quot;hard&quot; exceptions, that is those that do not involve network and console timeouts and are most
likely due to a logic error, are logged, reraised, and ultimately cause thread termination. If this happens
in the main thread (not likely due to its simplicity), then this causes program termination. If it happens
in the report processing thread (much more likely), then only the generation of reports will be
affected &mdash; the main thread will continue downloading data off the instrument and putting them in the
database. You can fix the problem at your leisure, without worrying about losing any data. </p>
<h1>Units</h1>
<p>In general, there are three different areas where the unit system makes a difference: </p>
<ol>
<li>On the weather station hardware. Different manufacturers use different unit systems for their hardware.
The Davis Vantage series use U.S. Customary units exclusively, Fine Offset and LaCrosse stations use
metric, while Oregon Scientific, Peet Bros, and Hideki stations use a mishmash of US and metric.
</li>
<li>In the database. Either US or Metric can be used.</li>
<li>In the presentation (i.e., html and image files).</li>
</ol>
<p>The general strategy is that measurements are converted by service <span class="code">StdConvert</span> as
they come off the weather station into a target unit system, then stored internally in the database in that
unit system. Then, as they come off the database to be used for a report, they are converted into a target
unit, specified by the skin. </p>
<h1>Value &quot;<span class="code">None</span>&quot;</h1>
<p>The Python special value <span class="code">None</span> is used throughout to signal a missing data point.
All functions must be written to expect it. </p>
<p>Device drivers should be written to emit <span class="code">None</span> if a data value is bad (perhaps
because of a failed checksum). If the hardware simply doesn't support it, then the driver should not emit a
value at all.</p>
<p>However, the time value must never be <span class="code">None</span>. This is because it is used as the
primary key in the SQL database. </p>
<h1>Time</h1>
<p><span class="code">Weewx</span> stores all data in UTC (roughly, &quot;Greenwich&quot; or &quot;Zulu&quot;)
time. However, usually one is interested in weather events in local time and want image and HTML generation
to reflect that. Furthermore, most weather stations are configured in local time. This requires that many
data times be converted back and forth between UTC and local time. To avoid tripping up over time zones and
daylight savings time, <span class="code">weewx</span> generally uses Python routines to do this conversion.
Nowhere in the code base is there any explicit recognition of DST. Instead, its presence is implicit in the
conversions. At times, this can cause the code to be relatively inefficient. </p>
<p>For example, if one wanted to plot something every 3 hours in UTC time, it would be very simple: to get the
next plot point, just add 10,800 to the epoch time: </p>
<pre class="tty">next_ts = last_ts + 10800 </pre>
<p>But, if one wanted to plot something for every 3 hours <em>in local time</em> (that is, at 0000, 0300, 0600,
etc.), despite a possible DST change in the middle, then things get a bit more complicated. One could modify
the above to recognize whether a DST transition occurs sometime between <span class="code">last_ts</span>
and the next three hours and, if so, make the necessary adjustments. This is generally what <span
class="code">wview</span> does. <span class="code">Weewx</span> takes a different approach and
converts from UTC to local, does the arithmetic, then converts back. This is inefficient, but bulletproof
against changes in DST algorithms, etc: </p>
<pre class="tty">time_dt = datetime.datetime.fromtimestamp(last_ts)
delta = datetime.timedelta(seconds=10800)
next_dt = time_dt + delta
next_ts = int(time.mktime(next_dt.timetuple()))</pre>
<p>Other time conversion problems are handled in a similar manner. </p>
<h1>Internationalization</h1>
<p>Generally, weewx does not make much use of Unicode. This is because the Python 2.x libraries do not always
handle it correctly. In particular, the function <span class="code">time.strftime()</span> completely fails
when handed a Unicode string with a non-ASCII character. As this function is often used by extensions,
working around this bug is an unfair expectation on extension writers. So, we generally avoid Unicode.</p>
<p>Instead, weewx mostly uses regular strings, with any non-ASCII characters encoded as UTF-8. </p>
<p>An exception to this general rule is the image generator, which holds labels internally in Unicode, because
that is the encoding expected by most fonts.</p>
<p>The document <i><a
href="https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/">The
Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character
Sets</a></i> by Joel Spolsky, is highly recommended if you are just starting to work with UTF-8 and Unicode.
</p>
<h1>Exceptions</h1>
<p>In general, your code should not simply swallow an exception. For example, this is bad form:</p>
<pre class="tty">
try:
os.rename(oldname, newname)
except:
pass
</pre>
<p>While the odds are that if an exception happens it will be because the file <span class="code">oldname</span>
does not exist, that is not guaranteed. It could be because of a keyboard interrupt, or a corrupted file
system, or something else. Instead, you should test explicitly for any expected exception, and let the rest
go by:</p>
<pre class="tty">
try:
os.rename(oldname, newname)
except OSError:
pass
</pre>
<p>Weewx has a few specialized exception types, used to rationalized all the different types of exceptions that
could be thrown by the underlying libraries. In particular, low-level I/O code can raise a myriad of
exceptions, such as USB errors, serial errors, network connectivity errors, <i>etc.</i> All device drivers
should catch these exceptions and convert them into an exception of type <span
class="code">WeeWxIOError</span> or one of its subclasses.</p>
<h1>Glossary</h1>
<p>This is a glossary of terminology used throughout the code. </p>
<table style="width: 95%" class='indent'>
<caption>Terminology used in weewx</caption>
<tr class="first_row">
<td>Name</td>
<td>Description</td>
</tr>
<tr>
<td class="text_highlight">archive interval</td>
<td>Weewx does not store the raw data that comes off a weather station. Instead, it aggregates the data
over a length of time, the <em>archive interval</em>, and then stores that.
</td>
</tr>
<tr>
<td class="text_highlight">archive record</td>
<td>While <em>packets</em> are raw data that comes off the weather station, <em>records</em> are data
aggregated by time. For example, temperature may be the average temperature over an <em>archive
interval</em>. These are the data stored in the SQL database
</td>
</tr>
<tr>
<td class="text_highlight"><span class="code">config_dict</span></td>
<td>All configuration information used by weewx is stored in the <em>configuration file</em>, usually
with the name <span class="code">weewx.conf</span>. By convention, when this file is read into the
program, it is called <span class="code">config_dict</span>, an instance of the class <span
class="code">configobj.ConfigObj</span>.
</td>
</tr>
<tr>
<td class="text_highlight">datetime</td>
<td>An instance of the Python object <span class="code">datetime.datetime</span>. Variables of type
datetime usually have a suffix <span class="code">_dt</span>.
</td>
</tr>
<tr>
<td class="text_highlight">db_dict</td>
<td>A dictionary with all the data necessary to bind to a database. An example for SQLite would be <span
class="code">
{'driver':'db.sqlite',
'root':'/home/weewx',
'database_name':'archive/weewx.sdb'}</span>, an example for MySQL would be <span class="code">{
'driver':'db.mysql',
'host':'localhost',
'user':'weewx',
'password':'mypassword',
'database_name':'weewx'}</span>.
</td>
</tr>
<tr>
<td class="text_highlight">epoch time</td>
<td>Sometimes referred to as &quot;unix time,&quot; or &quot;unix epoch time.&quot; The number of
seconds since the epoch, which is 1 Jan 1970 00:00:00 UTC. Hence, it always represents UTC (well...
after adding a few leap seconds. But, close enough). This is the time used in the databases and
appears as type <span class="code">
dateTime</span> in the SQL schema, perhaps an unfortunate name because of the similarity to the completely
unrelated Python type <span class="code">datetime</span>. Very easy to manipulate, but it is a big
opaque number.
</td>
</tr>
<tr>
<td class="text_highlight">LOOP packet</td>
<td>The real-time data coming off the weather station. The terminology "LOOP" comes from the Davis
series. A LOOP packet can contain all observation types, or it may contain only some of them
("Partial packet").
</td>
</tr>
<tr>
<td class="text_highlight">observation&nbsp;type</td>
<td>A physical quantity measured by a weather station (<i>e.g.</i>, <span class="code">outTemp</span>)
or something derived from it (<i>e.g.</i>, <span class="code">dewpoint</span>).
</td>
</tr>
<tr>
<td class="text_highlight"><span class="code">skin_dict</span></td>
<td>All configuration information used by a particular skin is stored in the <em>skin configuration
file</em>, usually with the name <span class="code">skin.conf</span>. By convention, when this file
is read into the program, it is called <span class="code">skin_dict</span>, an instance of the class
<span class="code">configobj.ConfigObj</span>.
</td>
</tr>
<tr>
<td class="text_highlight">SQL type</td>
<td>A type that appears in the SQL database. This usually looks something like <span class="code">outTemp</span>,
<span class="code">barometer</span>, <span class="code">extraTemp1</span>, and so on.
</td>
</tr>
<tr>
<td class="text_highlight">standard unit system</td>
<td>A complete set of units used together. Either <span class="code">US</span>, <span class="code">METRIC</span>,
or <span class="code">METRICWX</span>.
</td>
</tr>
<tr>
<td class="text_highlight">time stamp</td>
<td>A variable in unix epoch time. Always in UTC. Variables carrying a time stamp usually have a suffix
<span class="code">_ts</span>.
</td>
</tr>
<tr>
<td class="text_highlight">tuple-time</td>
<td>An instance of the Python object <span class="code">
<a href="http://docs.python.org/2/library/time.html#time.struct_time">
time.struct_time</a></span>. This is a 9-wise tuple that represent a time. It could be in either local
time or UTC, though usually the former. See module <span class="code">
<a href="http://docs.python.org/2/library/time.html">time</a></span> for more information. Variables
carrying tuple time usually have a suffix <span class="code">_tt</span>.
</td>
</tr>
<tr>
<td class="text_highlight">value tuple</td>
<td>A 3-way tuple. First element is a value, second element the unit type the value is in, the third the
unit group. An example would be <span class="code">(21.2,
&#39;degree_C&#39;, &#39;group_temperature&#39;)</span>.
</td>
</tr>
</table>
</div> <!--- end technical_content -->
<div class="footer">
<p class="copyright"> &copy; <a href="copyright.htm">Copyright</a> Tom Keffer </p>
</div>
</div>
<!-- Our scripts load last so the content can load first -->
<script type="text/javascript" src="js/jquery-1.11.1.min.js"></script>
<script type="text/javascript" src="js/jquery-ui-1.10.4.custom.min.js"></script>
<script type="text/javascript" src="js/jquery.tocify-1.9.0.min.js"></script>
<script type="text/javascript" src="js/weewx.js"></script>
<script type="text/javascript">
$(function () {
var level = get_default_level();
create_toc_control(level);
generate_toc(level);
});
</script>
<script type="text/javascript">
function showtab(tab, id) {
document.getElementById(tab + '-deb').style.display = 'none';
document.getElementById(tab + '-rpm').style.display = 'none';
document.getElementById(tab + '-macos').style.display = 'none';
document.getElementById(tab + '-setup').style.display = 'none';
document.getElementById(tab + '-' + id).style.display = 'block';
document.getElementById(tab + '-tab-deb').className = 'tab';
document.getElementById(tab + '-tab-rpm').className = 'tab';
document.getElementById(tab + '-tab-macos').className = 'tab';
document.getElementById(tab + '-tab-setup').className = 'tab';
document.getElementById(tab + '-tab-' + id).className = 'tab selected';
}
</script>
</body>
</html>