What is Discovery Informatics?
Discovery Informatics is concerned with the creation of new information from
existing information, whether previously stored, or as it flows through a
communication channel. How can we apply existing tools and create new tools
to help us to discover new information, validate what we discover, and, within
the human context, enter the discovery into our knowledge base?
The term "Discovery Informatics" was defined by William W. Agresti in
2003, as follows: "Discovery Informatics is the study and practice of
employing the full spectrum of computing and analytical science and
technology to the singular pursuit of discovering new information by
identifying and validating patterns in data."
The College of Charleston faculty, in the development of our undergraduate
program, have enhanced the definition by explicitly articulating the
contributing fields of Bayesian statistics, mathematical modeling,
and computer programming.
It's about gleaning new knowledge from existing information. Until
the 20th century, information was the critical input that helped
people solve problems. In the 20th century, information became
the solution: information itself was the product of commerce, from
science to the entertainment industry. And now, at the beginning of
the 21st century, information has become a problem: there seems to be
no end to how much information can be generated, communicated, and
stored, often without intention or value.
Discovery Informatics does not purport to solve the information glut.
Instead, the aim is to find meaning in information that is otherwise
not useful, or not as useful as it could be, because of its sheer
volume. Discovery Informatics has already had success in transforming
the way we do biology (as bioinformatics and genomics), medicine (as
medical informatics), and pharmacology (as drug informatics and drug
discovery).
The information revolution did not start with the computer. Gutenberg
was responsible for that, when his printing press made it possible to
copy information for mass distribution. Today networks copy and
transport information at almost no cost compared with paper-based
publication. Working with information stored as bits rather than atoms
has made all the difference. With this ease of publication and
transmission, every person and computational device connected to the
information grid can both publish and consume information. Discovery
Informatics is a challenging but quantitative response: finding added
value by thinking in new ways made possible only by a partnership
between humans and machines.
The Roots of Discovery Informatics
Discovery Informatics is based on the following three broad disciplines:
Deductive Inference | Inductive Inference | Dynamic Inference
Deductive Inference:
Logic studies the rational principles involved in determining when certain statements follow necessarily from other premises.
— If I believe such-and-such is true, then what other things am I forced to believe
as true?
—
If I determine that something is false, what implications does this
have on my other beliefs?
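The first question above, what one is forced to believe given what one already believes, can be sketched as forward chaining over simple if-then rules. This is only a minimal illustration: the rule set, the fact names, and the function name `forward_chain` are hypothetical and are not part of the program's materials.

```python
def forward_chain(facts, rules):
    """Return every fact derivable from `facts` using `rules`.

    Each rule is a pair (premises, conclusion): once all premises are
    known, the conclusion is forced (repeated modus ponens).
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical toy knowledge base, for illustration only.
rules = [
    (("rain",), "wet_ground"),
    (("wet_ground", "freezing"), "ice"),
    (("ice",), "slippery"),
]

derived = forward_chain({"rain", "freezing"}, rules)
print(sorted(derived))
# ['freezing', 'ice', 'rain', 'slippery', 'wet_ground']
```

Starting from only "rain" the loop stops at "wet_ground", since the rule for "ice" also needs "freezing"; nothing is concluded that the premises do not force.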
Proof Theory studies the way true statements are proved
with an eye towards automating this process so that a computer can
discover new truths based on logic.
—
Which proofs require a spark of
genius from a human and which proofs can be effectively discovered
by a computer?
—
How can we mechanize proofs and logical operations?
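One very small case of mechanized proof is propositional logic, where a computer can settle whether a formula is always true by trying every truth assignment. The sketch below assumes formulas are written as ordinary Python functions; the names `is_tautology` and `implies` are illustrative choices, and exhaustive checking is only one of many possible techniques.

```python
from itertools import product


def implies(a, b):
    """Material implication: 'a implies b'."""
    return (not a) or b


def is_tautology(formula, num_vars):
    """Check a propositional formula against every truth assignment."""
    return all(
        formula(*values)
        for values in product([False, True], repeat=num_vars)
    )


def modus_ponens(p, q):
    # ((p -> q) and p) -> q : a valid inference pattern
    return implies(implies(p, q) and p, q)


def affirming_the_consequent(p, q):
    # ((p -> q) and q) -> p : a well-known fallacy
    return implies(implies(p, q) and q, p)


print(is_tautology(modus_ponens, 2))              # True
print(is_tautology(affirming_the_consequent, 2))  # False
```

The number of assignments doubles with each added variable, so this brute-force approach is practical only for small formulas.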
Computability studies problems of computation and decidability.
It has been known since the 1930s that some precisely stated problems
are undecidable, whether by human or machine.
—
Which problems are decidable,
i.e., can be solved by a procedure that always finishes with a correct answer?
—
Can we recognize those problems that will potentially
yield solutions, thereby avoiding thorny problems with no solutions?
—
For
those problems that are decidable, can we build algorithms that will
solve them?
Inductive Inference:
Statistics studies means of inferring universal truths by looking
only at examples.
—
How can we extract and measure useful information from an entire
population using random subsamples of that population and prior knowledge?
—
How
can we predict the outcome of an election with millions of voters by taking a
random poll of 500 voters?
—
How can we predict the effects of a new drug on all
potential users by testing it on a carefully selected few?
—
How can we find genetic
markers for serious diseases?
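The polling question above has a standard back-of-the-envelope answer: for a simple random sample, the margin of error depends on the sample size rather than on the size of the electorate. The sketch below uses the usual normal approximation at roughly 95% confidence; the function name is illustrative, and real polls need further adjustments (weighting, non-response) that are not modeled here.

```python
import math


def poll_margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a simple random sample.

    Uses the normal approximation; z=1.96 corresponds to roughly 95%
    confidence, and proportion=0.5 is the worst case.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


moe = poll_margin_of_error(500)
print(f"{moe:.3f}")  # 0.044, i.e. about 4.4 percentage points
```

Under these assumptions, a random poll of 500 voters pins down a candidate's support to within about 4.4 percentage points, whether the electorate has one million voters or one hundred million.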
Learning Theory studies how we and machines learn to
predict, simulate, and understand the input/output patterns of
complicated "black-box" mechanisms by seeing training examples.
—
How can we
design computers to recognize someone's face or speech?
—
Which features
of a dataset do we extract that will prove useful for learning?
—
For
example, which features of the human face do we use to detect gender?
Very young children learn to recognize gender very accurately, whereas
today's most sophisticated computers do a relatively poor job. However,
the vast majority of mail in the U.S. is sorted by computers that find
and then read the zip codes, many of which are handwritten. The error
rate of these computer scanners is lower than that of human readers.
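Learning from training examples can be illustrated with one of the simplest methods, a nearest-neighbor classifier, which labels a new input by copying the label of the most similar example it has seen. The two-feature data set below is invented for illustration; real systems such as handwritten-digit readers use far richer features and models.

```python
import math


def nearest_neighbor_predict(training_examples, query):
    """Return the label of the training example closest to `query`."""
    closest = min(training_examples, key=lambda ex: math.dist(ex[0], query))
    return closest[1]


# Hypothetical training set: (features, label) pairs.
training = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((4.0, 4.2), "B"),
    ((3.8, 4.0), "B"),
]

print(nearest_neighbor_predict(training, (1.1, 0.9)))  # A
print(nearest_neighbor_predict(training, (4.1, 3.9)))  # B
```

The choice of features matters more than the code: if the two numbers did not separate the classes, no amount of training data would make this predictor accurate.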
Information Theory studies effective ways to encode all
sorts of information and knowledge suitable for analysis, storage,
searching, and communicating.
—
How can images, text, and audio files
be encoded effectively?
—
How much can you compress a piece of information
like a picture and still retain its essential features?
—
How do we retrieve
information that has been corrupted by noise?
—
How do we measure the
complexity of information and knowledge?
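One common way to measure the information content of a message is Shannon entropy: the average number of bits per symbol needed to encode it, given how often each symbol occurs. The following is a minimal sketch that treats each character independently, which is a simplifying assumption; the function name is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(message):
    """Average information per symbol, in bits, from symbol frequencies."""
    counts = Counter(message)
    total = len(message)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


print(shannon_entropy("aaaaaaaa"))  # 0.0 (perfectly predictable)
print(shannon_entropy("abababab"))  # 1.0 (one bit per symbol)
print(shannon_entropy("abcdabcd"))  # 2.0 (two bits per symbol)
```

A lower entropy means the message can, in principle, be compressed further without losing anything.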
Artificial Intelligence explores working definitions
of intelligence and ways of building machines that are intelligent.
—
How
do we represent knowledge?
—
How do we reason under uncertainty?
Researchers
in all fields are very good at forming hypotheses.
—
To what extent can
the process of hypothesis formation and goal setting be automated?
—
How
do we recognize if and when a machine is displaying intelligence?
—
When
and how can we trust intelligent machines?
This field remains
controversial and often devolves into philosophical debate.
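Reasoning under uncertainty is often formalized with Bayes' rule, which updates a prior belief in light of new evidence. The sketch below uses made-up numbers for a hypothetical diagnostic test; the function and parameter names are illustrative only.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(hypothesis | positive evidence) by Bayes' rule.

    prior:               P(hypothesis)
    likelihood:          P(positive | hypothesis)
    false_positive_rate: P(positive | not hypothesis)
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: a condition affecting 1% of people, a test that
# detects it 95% of the time and raises a false alarm 5% of the time.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(p, 3))  # 0.161
```

Even with a fairly accurate test, a positive result here leaves only about a 16% chance that the condition is present, because the condition is rare to begin with.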
Dynamic Inference:
Computational Complexity studies how hard it is to find
and verify solutions to specific decidable problems.
—
Can a solution
be executed
in a reasonable amount of time using a reasonable amount of memory?
—
How
difficult is it to simulate a complicated process?
—
Can we engineer
sufficiently fast algorithms to control a process or analyze a data
stream in real time?
—
How can we recognize effectively solvable problems
and questions?
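The gap between finding and verifying a solution can be seen in the subset-sum problem: given a list of numbers, is there a subset that adds up to a target? Checking a proposed answer is quick, while the naive search below may try every subset. This is an illustrative sketch with hypothetical function names, not an efficient solver.

```python
from itertools import combinations


def verify_subset_sum(numbers, subset, target):
    """Check a proposed solution in polynomial time."""
    remaining = list(numbers)
    for x in subset:
        if x not in remaining:
            return False          # uses a number that is not available
        remaining.remove(x)
    return sum(subset) == target


def find_subset_sum(numbers, target):
    """Brute-force search: may examine all 2**len(numbers) subsets."""
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == target:
                return list(subset)
    return None


numbers = [3, 34, 4, 12, 5, 2]
solution = find_subset_sum(numbers, 9)
print(solution)                                   # [4, 5]
print(verify_subset_sum(numbers, solution, 9))    # True
```

With six numbers the search is instant; with sixty, the same loop would face more than 10**18 subsets, while verification would remain fast.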
Complex Systems studies how seemingly simple rules can
display exceedingly complicated emergent behavior over time.
—
How can
we recognize when complicated behavior or data is the result of simply
understandable root causes having evolved over time?
—
How can we reconstruct
these simple dynamical rules?
—
Having these rules in hand, can we predict
or understand the resulting emergent behavior from these rules without
the need to observe the system over time?
The quintessential example
of a complex adaptive system is how species evolve over eons.
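A classic small-scale illustration of simple rules producing complicated behavior is an elementary cellular automaton, where each cell's next state depends only on itself and its two neighbors. The sketch below uses the rule commonly numbered 30; the grid width, wrap-around edges, and number of steps are arbitrary choices for illustration.

```python
def step(cells, rule=30):
    """Apply an elementary cellular automaton rule once (edges wrap around)."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighborhood = (left << 2) | (center << 1) | right
        # Bit number `neighborhood` of the rule gives the cell's next state.
        new_cells.append((rule >> neighborhood) & 1)
    return new_cells


# Start from a single live cell and print successive generations.
cells = [0] * 31
cells[15] = 1
for _ in range(15):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)
```

The whole rule fits in a single byte, yet the printed pattern quickly becomes irregular, which is the sense in which emergent behavior can be hard to predict from the rules alone.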
Evolutionary Algorithms studies the feasibility and effectiveness
of having algorithmic code evolve in much the same way as species evolve.
Different pieces of algorithmic code mutate, combine, and compete with
respect to some fitness criteria, hopefully evolving into an effective
and otherwise undiscovered algorithm for solving interesting problems,
analyzing data, or discovering important knowledge.
—
Given a specific
goal, how do we start this algorithmic evolutionary process in such
a way as to be successful?
—
Can evolutionary algorithms discover new
knowledge, and if so, can we understand and leverage the resulting
algorithmic code?
There have been important successes of evolutionary algorithms
in computer and electrical circuit design, with these algorithms
rediscovering previously patented designs and, most surprisingly,
leading to new patents that make the older ones obsolete.
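The mutate, combine, and compete cycle described above can be sketched with a toy genetic algorithm. For simplicity this example evolves bit strings rather than algorithmic code, and its fitness criterion (the number of 1 bits) is a stand-in chosen for illustration; the population size, mutation rate, and function name are likewise assumptions.

```python
import random


def evolve(length=20, population_size=30, generations=60, seed=0):
    """Evolve bit strings toward all ones; return (best, fitness history)."""
    rng = random.Random(seed)
    fitness = sum  # fitness = number of 1 bits in the string
    population = [
        [rng.randint(0, 1) for _ in range(length)]
        for _ in range(population_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: population_size // 2]   # competition
        children = []
        while len(survivors) + len(children) < population_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)
            child = mother[:cut] + father[cut:]          # combination
            if rng.random() < 0.2:
                i = rng.randrange(length)
                child[i] = 1 - child[i]                  # mutation
            children.append(child)
        population = survivors + children
    best = max(population, key=fitness)
    return best, history


best, history = evolve()
print(history[0], "->", sum(best))
```

Because the fitter half of each generation is carried over unchanged, the best fitness never decreases; how quickly it improves depends on the random seed and on the parameters chosen.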