College of Charleston
Discovery Informatics Program

What is Discovery Informatics?

Discovery Informatics is concerned with the creation of new information from existing information, whether previously stored or as it flows through a communication channel. How can we apply existing tools and create new tools to help us discover new information, validate what we discover, and, within the human context, enter the discovery into our knowledge base?

The term "Discovery Informatics" was defined by William W. Agresti in 2003, as follows: "Discovery Informatics is the study and practice of employing the full spectrum of computing and analytical science and technology to the singular pursuit of discovering new information by identifying and validating patterns in data."

The College of Charleston faculty, in the development of our undergraduate program, have enhanced the definition by explicitly articulating the contributing fields of Bayesian statistics, mathematical modeling, and computer programming.

It's about gleaning new knowledge from existing information. Until the 20th century, information was the critical input that helped people solve problems. In the 20th century information became the solution. Information itself was the product of commerce – from science to the entertainment industry. And now, at the beginning of the 21st century, information has become a problem – there seems to be no end to how much information can be generated, communicated and stored, often without intention or value.

Discovery Informatics does not purport to solve the information glut. Instead, the aim is to find meaning in information that is otherwise not useful – or not as useful as it could be – because of its sheer volume. Discovery Informatics has already had success in transforming the way we do biology (as bioinformatics and genomics), medicine (as medical informatics), and pharmacology (as drug informatics and drug discovery).

The information revolution did not start with the computer. It started with Gutenberg, when it became possible to copy information for mass distribution. Today networks copy and transport information at almost no cost compared with paper-based publication. Working with information stored as bits rather than atoms has made all the difference. With publication and transmission this easy, every person and computational device connected to the information grid can both publish and consume information. Discovery Informatics is a challenging but quantitative response: it seeks added value through new ways of thinking made possible only by a partnership between humans and machines.

The Roots of Discovery Informatics

Discovery Informatics is based on the following three broad disciplines:

Deductive Inference   |   Inductive Inference   |   Dynamic Inference


 

Deductive Inference:
Logic studies the rational principles involved in determining when certain statements follow necessarily from other premises.
— If I believe such-and-such is true, then what other things am I forced to believe as true?
— If I determine that something is false, what implications does this have on my other beliefs?
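
The first question above can be illustrated with a minimal sketch of forward chaining, which repeatedly applies "if these premises hold, then this conclusion holds" until nothing new follows. The function name `forward_chain` and the example beliefs are hypothetical, chosen only for illustration.

```python
def forward_chain(facts, rules):
    """Return every statement forced by the starting facts.

    rules is a list of (premises, conclusion) pairs; a rule fires
    once all of its premises are already believed (modus ponens).
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical beliefs, chosen only for illustration.
rules = [
    (["it_rains"], "ground_is_wet"),
    (["ground_is_wet", "it_is_freezing"], "ground_is_icy"),
    (["ground_is_icy"], "roads_are_dangerous"),
]
beliefs = forward_chain({"it_rains", "it_is_freezing"}, rules)
print(sorted(beliefs))
```

Starting from two beliefs, the sketch derives three more. Dropping "it_is_freezing" from the starting set leaves only "ground_is_wet" as a forced conclusion.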

Proof Theory studies the way true statements are proved with an eye towards automating this process so that a computer can discover new truths based on logic.
— Which proofs require a spark of genius from a human and which proofs can be effectively discovered by a computer?
— How can we mechanize proofs and logical operations?

Computability studies problems of computation and decidability. It has been known since the 1930s that some precisely stated problems are undecidable, whether by human or by machine.
— Which problems are decidable, i.e., have solutions?
— Can we recognize those problems that will potentially yield solutions, thereby avoiding thorny problems with no solutions?
— For those problems that are decidable, can we build algorithms that will solve them?

Inductive Inference:
Statistics studies means of inferring universal truths by looking only at examples.
— How can we extract and measure useful information from an entire population using random subsamples of that population and prior knowledge?
— How can we predict the outcome of an election with millions of voters by taking a random poll of 500 voters?
— How can we predict the effects of a new drug on all potential users by testing it on a carefully selected few?
— How can we find genetic markers for serious diseases?
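
The polling question above has a standard textbook answer that a short sketch can make concrete. Assuming a simple random sample and the usual normal approximation, the margin of error depends on the sample size, not on the size of the population. The function names and the 52% "true support" figure are hypothetical.

```python
import math
import random


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a simple random sample."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


def simulate_poll(true_support, sample_size, rng):
    """Poll sample_size random voters; return the observed support."""
    votes = sum(rng.random() < true_support for _ in range(sample_size))
    return votes / sample_size


rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = simulate_poll(true_support=0.52, sample_size=500, rng=rng)
moe = margin_of_error(500)
print(f"estimate = {estimate:.3f}, margin of error = +/-{moe:.3f}")
```

Under these assumptions a poll of 500 voters carries a margin of error of roughly 4.4 percentage points, whether the electorate has thousands of voters or millions.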

Learning Theory studies how humans and machines learn to predict, simulate, and understand the input/output patterns of complicated "black-box" mechanisms by seeing training examples.
— How can we design computers to recognize someone's face or speech?
— Which features of a dataset do we extract that will prove useful for learning?
— For example, which features of the human face do we use to detect gender?

Very young children learn to recognize gender very accurately, whereas today's most sophisticated computers do a relatively poor job. Yet the vast majority of mail in the U.S. is sorted by computers that find and then read the zip codes, many of which are handwritten. The error rate of these computer scanners is lower than that of human sorters.

Information Theory studies effective ways to encode all sorts of information and knowledge suitable for analysis, storage, searching, and communicating.
— How can images, text, and audio files be encoded effectively?
— How much can you compress a piece of information like a picture and still retain its essential features?
— How do we retrieve information that has been corrupted by noise?
— How do we measure the complexity of information and knowledge?
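
Two of these questions, compression and measuring complexity, can be sketched with the Python standard library. The example below computes the empirical Shannon entropy of a byte string (a crude complexity measure that looks only at byte frequencies) and compares how well `zlib` compresses regular and random-looking data. It covers lossless compression only; the lossy case raised by the picture question is not addressed. The helper name `shannon_entropy` and the sample data are assumptions made for illustration.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Empirical entropy of the byte frequencies, in bits per byte."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


rng = random.Random(0)  # fixed seed so the run is repeatable
repetitive = b"abab" * 250                               # highly regular
noisy = bytes(rng.getrandbits(8) for _ in range(1000))   # random-looking

for name, data in [("repetitive", repetitive), ("noisy", noisy)]:
    packed = zlib.compress(data)
    assert zlib.decompress(packed) == data  # lossless: nothing is lost
    print(f"{name}: {shannon_entropy(data):.2f} bits/byte, "
          f"{len(data)} -> {len(packed)} bytes")
```

The regular input has low entropy and shrinks dramatically, while the random-looking input has high entropy and barely compresses at all.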

Artificial Intelligence explores working definitions of intelligence and ways of building machines that are intelligent.
— How do we represent knowledge?
— How do we reason under uncertainty?
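
One common approach to reasoning under uncertainty is Bayes' rule, which updates a belief in light of new evidence. The sketch below is a minimal illustration, not a description of any particular system; the function name `bayes_update` and all of the numbers are hypothetical.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability of a hypothesis after one observation."""
    evidence = prior * likelihood_if_true + (1 - prior) * likelihood_if_false
    return prior * likelihood_if_true / evidence


# Hypothetical numbers: a condition affecting 1% of people, and a test
# that flags 95% of true cases but also 5% of healthy people.
posterior = bayes_update(prior=0.01,
                         likelihood_if_true=0.95,
                         likelihood_if_false=0.05)
print(f"P(condition | positive test) = {posterior:.3f}")
```

With these assumed numbers a positive test raises the probability from 1% to only about 16%, because false positives among the many healthy people outnumber the true positives.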

Researchers in all fields are very good at forming hypotheses.
— To what extent can the process of hypothesis formation and goal setting be automated?
— How do we recognize if and when a machine is displaying intelligence?
— When and how can we trust intelligent machines?

This field remains controversial and often devolves into philosophical debate.
 
Dynamic Inference:
Computational Complexity studies how hard it is to find and verify solutions to specific decidable problems.
— Can a solution be executed in a reasonable amount of time using a reasonable amount of memory?
— How difficult is it to simulate a complicated process?
— Can we engineer sufficiently fast algorithms to control a process or analyze a data stream in real time?
— How can we recognize effectively solvable problems and questions?

Complex Systems studies how seemingly simple rules can display exceedingly complicated emergent behavior over time.
— How can we recognize when complicated behavior or data is the result of simply understandable root causes having evolved over time?
— How can we reconstruct these simple dynamical rules?
— Having these rules in hand, can we predict or understand the resulting emergent behavior from these rules without the need to observe the system over time?
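
A classic illustration of simple rules producing complicated behavior is the elementary cellular automaton. The sketch below uses Rule 30, a standard example chosen here for illustration and not one named in the text above. Each cell's next state depends only on itself and its two neighbors.

```python
def step(cells, rule=30):
    """Advance a one-dimensional, two-state cellular automaton one step.

    Each new cell depends only on itself and its two neighbours; the
    eight possible neighbourhoods index the bits of the rule number.
    The edges wrap around.
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, centre, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


width = 31
cells = [0] * width
cells[width // 2] = 1  # start from a single live cell

history = [cells]
for _ in range(15):
    history.append(step(history[-1]))

for row in history:
    print("".join("#" if c else "." for c in row))
```

The whole rule fits in a single eight-bit number, yet the printed pattern quickly becomes irregular. Recovering that rule from the pattern alone is the kind of reconstruction problem the questions above describe.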

The quintessential example of a complex adaptive system is how species evolve over eons of time.

Evolutionary Algorithms studies the feasibility and effectiveness of having algorithmic code evolve in much the same way as species evolve. Different pieces of algorithmic code mutate, combine, and compete with respect to some fitness criteria, hopefully evolving into an effective and otherwise undiscovered algorithm for solving interesting problems, analyzing data, or discovering important knowledge.
— Given a specific goal, how do we start this algorithmic evolutionary process in such a way as to be successful?
— Can evolutionary algorithms discover new knowledge, and if so, can we understand and leverage the resulting algorithmic code?
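
The mutate, combine, and compete cycle described above can be sketched with a toy genetic algorithm. Here the "code" being evolved is just a list of bits and the fitness criterion is the number of 1 bits. This is a deliberately simple stand-in, not the circuit-design work mentioned in this section, and all names and parameter values are hypothetical.

```python
import random


def fitness(genome):
    """Toy fitness criterion: the number of 1 bits."""
    return sum(genome)


def mutate(genome, rate, rng):
    """Flip each bit independently with probability rate."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def crossover(a, b, rng):
    """Combine two parents at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(length=20, population_size=30, generations=60, rate=0.02, seed=0):
    """Evolve bit lists toward high fitness; return the fittest one found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Compete: keep the fitter half as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        # Combine and mutate to refill the population.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   rate, rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(fitness(best), best)
```

Because the fitter half survives unchanged each generation, the best fitness never decreases. Real applications differ mainly in what the genome encodes and how fitness is measured.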

Evolutionary algorithms have had important successes in computer and electrical circuit design: they have rediscovered existing patented designs and, most surprisingly, produced new patentable designs that make the older ones obsolete.



Copyright© All Rights Reserved    Website design by bleezardes@cofc.edu