Abstract:
Wireless Sensor Networks (WSNs) are increasingly used in several application
areas, particularly to collect data and monitor physical processes.
Moreover, sensor nodes used in environmental monitoring applications, such
as aquatic sensor networks, are often subject to harsh environmental conditions
while monitoring complex phenomena. Non-functional requirements,
like reliability, security or availability, are increasingly important and must be
accounted for in the application development. For that purpose, there is a
large body of knowledge on dependability techniques for distributed systems,
which provides a good basis to understand how to satisfy these non-functional
requirements of WSN-based monitoring applications. Given the data-centric
nature of monitoring applications, it is of particular importance to ensure that
data is reliable or, more generically, that it has the necessary quality.
The problem of ensuring the desired quality of data for dependable monitoring
using WSNs is studied herein. From a dependability-oriented perspective,
the possible impairments to dependability and the prominent existing solutions
to solve or mitigate these impairments are reviewed. Despite the variety
of components that may form a WSN-based monitoring system, particular
attention is given to understanding which faults can affect sensors, how
they can affect the quality of the information, and how this quality can be
improved and quantified. Open research issues for the specific case of aquatic
monitoring applications are also discussed.
One of the challenges in achieving a dependable system behavior is to overcome
the external disturbances affecting sensor measurements and detect the
failure patterns in sensor data. This is a particular problem in environmental
monitoring, due to the difficulty in distinguishing a faulty behavior from
the representation of a natural phenomenon. Existing solutions for failure
detection assume that physical processes can be accurately modeled, or that
there are large deviations that may be detected using coarse techniques, or,
more commonly, that the network is dense and contains value-redundant
sensors.
This thesis aims to define a new methodology for dependable data quality
in environmental monitoring systems, in order to detect faulty measurements
and increase the quality of sensor data. The framework of the methodology is
presented through a generically applicable design, which can be applied to
any environmental sensor network dataset.
The methodology is evaluated on various datasets from different WSNs, where
machine learning is used to model each sensor's behavior, exploiting the existence
of correlated data provided by neighboring sensors. Data fusion strategies
are explored in order to effectively detect potential failures for
each sensor and, simultaneously, distinguish truly abnormal measurements
from deviations due to natural phenomena. This is accomplished with the
successful application of the methodology to detect and correct outliers, offset
and drifting failures in datasets from real monitoring networks.
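As a rough illustration of the idea of modeling one sensor from a correlated
neighbor, the sketch below fits a model on a trusted training window, flags
measurements whose residual is unusually large, and replaces them with the
model prediction. It is a minimal sketch under stated assumptions, not the
methodology of the thesis: a one-neighbor least-squares line stands in for
the machine learning models, a median-absolute-deviation threshold stands in
for the failure detection step, and all names (fit_line, flag_and_correct,
train_n, k) and the sample values are hypothetical.

```python
from statistics import mean, median


def fit_line(x, y):
    """Least-squares fit y ~ a*x + b (assumed stand-in for a learned model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def flag_and_correct(target, neighbour, train_n, k=5.0):
    """Flag target readings that disagree with the neighbour-based prediction.

    The first train_n samples are assumed to be fault-free and are used both
    to fit the model and to estimate the normal spread of the residuals.
    Returns (indices of flagged samples, corrected copy of target).
    """
    a, b = fit_line(neighbour[:train_n], target[:train_n])
    predicted = [a * n + b for n in neighbour]
    residuals = [t - p for t, p in zip(target, predicted)]

    # Robust spread estimate from the training residuals (scaled MAD).
    train_res = residuals[:train_n]
    med = median(train_res)
    mad = median([abs(r - med) for r in train_res])
    threshold = k * 1.4826 * mad

    flagged = [i for i, r in enumerate(residuals) if abs(r - med) > threshold]

    # Replace each flagged reading with the model prediction.
    corrected = list(target)
    for i in flagged:
        corrected[i] = predicted[i]
    return flagged, corrected


# Hypothetical data: the target reads about 1.0 above its neighbour,
# with one injected outlier at index 8.
neighbour = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0, 14.5]
target = [11.02, 11.49, 12.01, 12.48, 13.0, 13.51, 13.99, 14.52, 25.0, 15.5]

flagged, corrected = flag_and_correct(target, neighbour, train_n=6)
print(flagged, round(corrected[8], 2))
```

A residual threshold of this kind only addresses isolated outliers. Offset
and drifting failures persist over many samples, so detecting them would
need some analysis of the residuals over time, which this sketch leaves out.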
In the future, the methodology can be applied to optimize the data quality
control processes of new and already operating monitoring networks, and assist
in network maintenance operations.