Computer Science vs. Information Studies vs. Information Systems (With a Little Informatics)

Article first published as Computer Science vs. Information Studies vs. Information Systems (with a Little Informatics) on Technorati.

There are perhaps a half-dozen major, intertwining domains in the IT field, and a little deciphering is often necessary when weighing the educational possibilities. When it comes to seeing how they stack up, a little cheat sheet can be handy.

Computers are still calculators. Even the self-driving Google Car functions on a mathematical basis: behind the artificial intelligence that allows Google's vehicle to navigate traffic is a complicated system of calculations, developed by the best and brightest of the info-tech sector. Whether you are getting ready to become the next Pierre Omidyar or you are hiring your next Mad Scientist, you need to know the difference between the numerous IT academic programs out there. That requires a quick trip through the information science domains, starting with the forefather: computer science.

Computer science is the grand-pappy of the Information Revolution. Heavy in math and theory, computer science students learn the foundations of computing thoroughly. Comp-sci covers everything: networking, programming, theory, practice. Computer science is the wellspring, and good students should be able to pick up most any technical task quickly. Bad CS students, on the other hand, know just enough to wreak havoc. Computer science is by far the most difficult of these fields, and it has probably produced more converts to liberal arts majors than any other scientific field besides medicine.

Information systems has a dual identity. One path of IS deals with technical management. The other deals with expert systems, artificial intelligence, databases, informatics, and all those enterprise-level developments you hear about. Certain schools offer even more specialized topics, and many universities form partnerships with corporate R&D departments. Information systems is a strong contender, and its graduates compete well for IT positions after graduation.

Software engineering deals with software development, systems engineering, and project management. At its core, software engineering teaches advanced methods for handling software application development. Something like computer science on steroids, it is much more an engineering field than the other, lighter information technology domains.

Information studies is the field that librarians hail from. It deals with both the end-user side and the organization of information. How information is used, reproduced, analyzed, and integrated: all of this is part of the information studies domain. Beyond strict skirts and buns, librarians now practice web design, work with a variety of databases, and perfect specialized query techniques. Teaching the basics of smart computer literacy is also a foundation of information studies; after all, educated citizens are competent citizens.

The informatics domains are the newest players on the field. Informatics can roughly be described as the intersection of technology, people, and business. Not quite hard-core IT, yet far from liberal arts, informatics blends a mishmash of different technical concepts. While informatics work is common and necessary in every aspect of technology, the field itself has only barely become established. Some schools offer sub-specialties in community informatics, medical informatics, or information management. Effectively, all of these informatics domains are siblings, and all are distant cousins of their grandfather, computer science.

There is a plethora of IT domains to pick from. Importantly, the technical domains are not exclusive: many information studies majors can function as Tier 1 help-desk technicians with minimal training, and most IS graduates have solid experience with database administration and system design. Computer science can lead directly into software engineering, and good informatics programs offer enough technical education to meet the criteria for most any position. If you are hiring, assess candidates by grades and graduate-level work. If you are choosing a program, pick honestly by interest and ability. Be willing to keep learning, and be dedicated to learning, discovering, and developing the next big thing. Finally, carry away this one imperative lesson: technology is skill, not luck.

Images from Free Digital Photos, photographers are Graur Codrin (student) and Luigi Diamanti (robot).

Read more: http://technorati.com/business/article/computer-science-vs-information-studies-vs/#ixzz1Fc4MBUNV


China’s International Hacking Exploit

A quick not-so-technical bit about April 8, 2010. One of China’s smaller Internet providers tricked online traffic into flowing through China’s ISP system. Quite a few Federal/high-security computer networks also happened to be rerouted. If you didn’t know, traffic can be recorded and analyzed by ISPs. People are divided on whether it was an accident.

Now the technical bits:

During the hijacking of international Web traffic in April 2010, China announced roughly 37,000 IP prefixes (blocks of addresses) it did not own and routed a huge amount of traffic through a central Chinese backbone ISP. Notably, this was done with simple methods that almost any third-tier ISP could perform. The issue lies with the Border Gateway Protocol (BGP) and the routing tables it builds. BGP determines the routing paths used by larger ISPs, whose tables are exchanged with a number of like-sized ISPs as well as with providers up and down the chain.
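
To see why even a modest provider can pull traffic its way, remember that routers prefer the most specific matching prefix they have heard. The toy sketch below illustrates that longest-prefix-match selection logic; the prefixes and AS numbers are invented, and this is not a model of the actual 2010 announcements.

```python
# Toy illustration of BGP-style longest-prefix-match route selection.
# Prefixes and AS numbers are invented for the example.
import ipaddress

# A legitimate route plus a bogus, more-specific announcement covering part of it.
routing_table = {
    ipaddress.ip_network("203.0.113.0/24"): "AS64500 (legitimate origin)",
    ipaddress.ip_network("203.0.113.128/25"): "AS64666 (hijacker's more-specific route)",
}

def select_route(destination: str) -> str:
    """Pick the most specific (longest) matching prefix, as routers do."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in routing_table if addr in net]
    if not matches:
        return "no route"
    best = max(matches, key=lambda net: net.prefixlen)
    return routing_table[best]

print(select_route("203.0.113.10"))   # only the /24 matches -> legitimate origin
print(select_route("203.0.113.200"))  # the /25 is more specific -> traffic drawn to the hijacker
```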

Methods to prevent BGP hijacking are complex. A simple, Symantec-style prevention would block all traffic from the hijacking ISP after the fact. This essentially blackballs the offending ISP, which would be unable to receive traffic from the blocker. While not practical against larger ISPs, this would be an effective deterrent to smaller ones.
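
A minimal sketch of that blackballing idea, under the assumption that the filter simply rejects any route announcement whose AS path touches a blocklisted network (the AS numbers are placeholders):

```python
# Minimal sketch of "blackballing": reject any route announcement whose
# AS path includes an operator on the blocklist. AS numbers are placeholders.
BLACKLISTED_ASNS = {64666}

def accept_announcement(prefix: str, as_path: list[int]) -> bool:
    """Return False if any AS in the path has been blackballed."""
    return not BLACKLISTED_ASNS.intersection(as_path)

print(accept_announcement("203.0.113.0/24", [64500, 64501]))    # True: clean path
print(accept_announcement("203.0.113.128/25", [64777, 64666]))  # False: path crosses the offender
```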

Unfortunately, the hijacking ISP in China's case was a smaller provider that propagated its bogus routes through the larger backbone ISP, almost like a virus: injecting a payload and then gleaning from the effect while riding on the massive strength of China's backbone. Even if the smaller ISP were banned from exchanging traffic, this would not prevent future hijackings.

Another, more complicated method would be to maintain a warning system within the IPv6 payload sent from secure, trusted servers. This would act as a countermeasure or antibody: the same mechanism that performs route hijacking would be used to prevent it. While the mathematics is beyond me, a modified algorithm could detect sudden switches in traffic, either toward a specific known block of ISPs (the blackballed ISPs mentioned in the first solution, for example) or through a sudden change in route distances, since standard routes should theoretically stay within a general range. There are ways to defeat each of these measures (penetrating trusted servers to corrupt the safety messages, or altering routes so that the apparent distance matches the known range), but all three solutions (blackballing, trusted warnings, and route-anomaly detection) could be implemented with varying amounts of haste.
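
A rough sketch of the detection side, assuming a monitor remembers the usual origin AS and path length for each watched prefix and flags sudden departures; real deployments are considerably more sophisticated, and the thresholds and data here are illustrative only.

```python
# Rough sketch of route-anomaly detection: remember the usual origin AS and
# path length for each watched prefix, and flag sudden departures.
EXPECTED = {
    "203.0.113.0/24": {"origin": 64500, "typical_path_len": 4},
}

def check_update(prefix: str, as_path: list[int], slack: int = 2) -> list[str]:
    """Return a list of warnings for a received route update."""
    warnings = []
    baseline = EXPECTED.get(prefix)
    if baseline is None:
        return warnings
    if as_path[-1] != baseline["origin"]:          # origin AS is the last hop in the path
        warnings.append(f"{prefix}: origin changed to AS{as_path[-1]}")
    if abs(len(as_path) - baseline["typical_path_len"]) > slack:
        warnings.append(f"{prefix}: path length jumped to {len(as_path)}")
    return warnings

print(check_update("203.0.113.0/24", [64666]))  # unexpected origin and an abnormally short path
```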

For further reference, see the article at Arbor Networks: http://asert.arbornetworks.com/2010/11/additional-discussion-of-the-april-china-bgp-hijack-incident/

For a quick summary of the situation, see Computerworld's article, which also details the political fallout from the debacle: http://www.computerworld.com/s/article/9197019/Update_Report_sounds_alarm_on_China_s_rerouting_of_U.S._Internet_traffic


Many Eyes: Data Visualization Made Easy with IBM and Cognos

Many Eyes, IBM's Cognos visualization system

Article first published as Many Eyes: Data Visualization with IBM and Cognos on Technorati.

Probably the little-known champion of easy data visualization is Many Eyes, a SaaS offering from IBM and Cognos. Hosted on IBM's servers, Many Eyes allows raw data in either text or table format to be uploaded and processed into multiple visualization formats.

The Many Eyes IBM/Cognos system collects user input in either free-text or spreadsheet format, stores it on the server, and makes the data set open to any user of Many Eyes. The tasks here are multiple: one is allowing free information access, another is visualization creation, a third is eliciting user evaluation via comments and feedback, and yet another is gathering sources of free data for IBM's corporate use. Essentially, using Many Eyes is an exchange of data for access to the visualization service.

Many Eyes makes each step of the visualization creation process clear: first data collection takes place, then data abstraction, then visual abstraction, and then visual delivery of the transmogrified data. Importantly, it is a static system; once the data is uploaded, it cannot be modified, only visualized in multiple ways.
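
Many Eyes' own code is not public, but that four-stage pipeline can be sketched generically. The toy example below, which is not Many Eyes' implementation, walks a small tab-separated data set through data abstraction, visual abstraction, and a crude visual delivery.

```python
# Generic illustration of the four-stage visualization pipeline
# (collect -> abstract data -> abstract visuals -> deliver); not Many Eyes code.
from collections import Counter

# 1. Data collection: free text or a tab-separated table, as Many Eyes accepts.
raw = "state\tvisits\nHawaii\t7\nOregon\t3\nOhio\t5\n"

# 2. Data abstraction: parse the table into labeled numeric records.
rows = [line.split("\t") for line in raw.strip().splitlines()[1:]]
counts = Counter({label: int(value) for label, value in rows})

# 3. Visual abstraction: map each value onto a visual property (bar length here).
scale = 40 / max(counts.values())
bars = {label: "#" * round(value * scale) for label, value in counts.items()}

# 4. Visual delivery: render the abstraction for the viewer.
for label, bar in sorted(bars.items()):
    print(f"{label:>8} {bar}")
```
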
IBM makes several solid assumptions about users, many of which are typical of visualization tools: that they prefer a neutral or pastel color scheme, that they are inputting either free text or spreadsheet data, and that they are making no mistakes with their data source. The last assumption is particular to information visualization, which is essentially a complex mathematical model of the underlying data.

IBM has effectively built an iterative system in which users give feedback to other users, handing IBM a high-quality visualization analysis system. While IBM could conduct numerous investigations and evaluation studies in an empirically validated setting, it has chosen to crowd-source both its data sources and its visualizations. This stands out as a particularly effective research model, acting as a rough capture system and aiding not only metric analysis but also qualitative metadata collection. IBM is also experimenting with pseudo-open-source methods and community data collection.

Many Eyes will remain a pseudo-open-source system until it is available in multiple places without the looming threat of restricted access. IBM ultimately retains control of the visualization system and the data, unlike a user running Prefuse Flare, where the data can be manipulated on the client's computer and disseminated without a corporate intermediary. Alternatives to Many Eyes include Tableau, or building an ontology in Protégé and then processing it with the Protégé module TGVizTab.

Compared with those alternatives, Many Eyes can render more complex data, but it demands a higher level of data sophistication and purity of input. When you need to create stunning interactive pictures quickly, IBM's Many Eyes is a great solution. After all, in the 21st century, knowledge is art.

Many thanks to Prof. Chaomei Chen at Drexel University's iSchool for his depth of visualization knowledge. Image courtesy of Closeup Eye by Petr Kratochvil at Public Domain Pictures.

Health Information Technology: A Case-Based Reasoning System Proposal (Health Information Technology: Practical Concepts)


This is a white/technical paper on a cutting-edge expert system, the Health Information Technology Development Resource (HITdr). It outlines the concepts involved in problem-solution pairings and employs an inference mechanism that uses similarity-based matching algorithms to select recommended solutions from user input. The project develops a knowledge-based system that serves as an online development resource of "lessons learned" for health information technology system developers. Further, an ontology is defined to support the consistent representation of knowledge within the system as well as to enhance information retrieval.

The proposed system will provide access to the invaluable knowledge gained through actual EHR implementation experiences. Given the unstructured nature of the problem, coupled with the existence of prior knowledge scenarios and domain-specific concepts, a case-based reasoning (CBR) system supported by a semantic framework (i.e., an ontology) is recommended. Not knowing what might work best in a given context (the unknown) makes it difficult for system developers to propose and justify new technologies. Finding a way to avoid "reinventing the wheel" with each new implementation is an important step toward reducing both uncertainty and cost.
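
As a rough illustration of how similarity-based case retrieval works (this is a generic sketch, not the HITdr matching algorithm, and the features, weights, and cases are invented), stored problem-solution pairs can be scored against a new problem description and the closest matches returned:

```python
# Minimal case-based reasoning sketch: score stored problem-solution cases
# against a new problem description and return the best-matching solutions.
# Features, weights, and cases are invented for illustration.
WEIGHTS = {"setting": 3.0, "system_type": 2.0, "issue": 5.0}

CASE_LIBRARY = [
    {"setting": "rural clinic", "system_type": "EHR", "issue": "clinician adoption",
     "solution": "phased rollout with super-user training"},
    {"setting": "hospital", "system_type": "EHR", "issue": "interface errors",
     "solution": "interface testing harness before go-live"},
]

def similarity(problem: dict, case: dict) -> float:
    """Weighted count of matching categorical features."""
    return sum(w for f, w in WEIGHTS.items() if problem.get(f) == case.get(f))

def recommend(problem: dict, top_n: int = 1) -> list[str]:
    ranked = sorted(CASE_LIBRARY, key=lambda c: similarity(problem, c), reverse=True)
    return [case["solution"] for case in ranked[:top_n]]

print(recommend({"setting": "rural clinic", "system_type": "EHR",
                 "issue": "clinician adoption"}))
```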


Cracked Seed

Article first published as Cracked Seed on Technorati.

When I lived on the Mainland, there was a shop that carried unique foods. One of my family's favorites was small preserved cherries, but they were rarely in stock.

These fruits had not been pickled, but instead seasoned and dehydrated. Shriveled and puckered, they looked like cherry-raisins. My wife told me that these preserved cherries are Hawaiian cracked seed, and that we could buy them by the pound on Oahu.

Cracked seed is a category of snack foods and treats. One sub-category is flavored sugary candy, but most cracked seed are spiced dried fruits or preserved seafood. Hawaii's government has reported that the state imports about 85% of the food it eats, so preservation is a critical part of using food wisely. Cracked seed meets that need by preserving foodstuffs for later use.

The flavor profile tends to be sour and/or tart, even with gummi candy and dried papaya (arguably one of the sweeter fruits). Typical cracked seed are wasabi peas, dried scallops, dehydrated cuttlefish, li hing mui, and my beloved preserved cherries. You learn to gnaw around the seed, which is typically retained inside the fruit.

Cracked seed is not for everyone. An unnamed relative actually described the preserved cherries we sent him as 'gross', and my sister had less than favorable reviews for some of the items in her birthday package.

Heed this warning: cracked seed can be an acquired taste. Some people may prefer them to candies, but the flavor profile sharply veers away from the American palate, with a dominance of savory, tart, and bitter. Sweet is sometimes the focus, but the typical sugary candy is difficult to find in a cracked seed shop; most corn-sugar candies have been dusted with ground li hing mui, an intense dried plum that rivals alum powder for puckering ability.

Notably, cracked seed is rarely available on the Mainland, and it can be costly outside of the Hawaiian islands. Locating a reliable source of these delicacies can be an adventure in itself.

The best place to buy is in Hawaii itself. The cracked seed trade is booming, with entire shops that stock nothing but shelf upon shelf of the savory treats. If you want to really amaze your friends with something unique from Hawaii, bring some home.

Read more: http://technorati.com/lifestyle/travel/article/cracked-seed/#ixzz14lhHbZXE


The Deep Web: The Internet’s Obscure Data Mine


Article first published as The Deep Web: The Internet’s Obscure Data Mine on Technorati

Deep Web. Invisible Web. The Hidden Internet. Every knowledge professional, from librarian to data mining specialist, should know that the Deep Web exists.

The term “Deep Web” refers to data and resources that are not accessed by typical Googling. Examples include university-hosted databases, organizational Wikis, topic-specific forums, and other volatile locations difficult to index with typical bots and spiders. Notably, some of these massive websites can be very information-intensive – and also nigh-impossible to crawl via current public methods.

The Invisible Web is the part of the web that provides convenient, direct access to data without a general search engine as intermediary. This hidden information is typically openly accessible, but it requires a higher level of user filtering than a normal Google or Bing search. Often, interpreting the results requires a combination of technical skills (for understanding) and research skills (for locating).

Corporations and fast-moving organizations can especially benefit from delving into the refined data streams of the larger, less volatile resources. Non-advertised, often scientific, and tucked away in a digital corner, Deep Web resources are highly prized for their high data-to-noise ratio.

The technical resources that the Invisible Web covers make it vital for hard-core research. Deep Web materials include the National Institutes of Health's extensive libraries, Google Patents, open-source community blogs and wikis, Metafilter's rambling yet insightful community, and the millions of data-intensive sites that have never been fully indexed by a search engine. Good library science programs teach intensively about this hidden gem, with the caveat that it is fruitful but slow to work with.
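
To make that concrete, one of those NIH resources (PubMed, reachable through NCBI's public E-utilities query interface) can be queried directly rather than through a general search engine. A minimal sketch, where the search term is arbitrary:

```python
# Minimal example of querying a Deep Web resource directly: NCBI's public
# E-utilities search endpoint for PubMed. The search term is arbitrary.
from urllib.parse import urlencode
from urllib.request import urlopen

params = urlencode({"db": "pubmed", "term": "information retrieval", "retmax": 5})
url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?" + params

with urlopen(url) as response:
    print(response.read().decode()[:500])  # XML listing the matching record IDs
```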

The Deep Web is a key foundation of the visible Web that Yahoo!, Bing, and Technorati allow us to access. Assessing and incorporating the information takes some serious spelunking, but the depths yield beautiful (and useful) rewards.

Images from Public Domain Pictures.

Read more: http://technorati.com/technology/article/the-deep-web-the-internets-obscure/#ixzz14leYqSzJ


Think Librarians Are Hot? Not So Fast


Article first published as Think Librarians Are Hot? Not So Fast on Technorati.

‘Librarian’ may not really deserve a place on the top 50 career choices of 2011. Our nation’s geek love is driving people to love books and love libraries. It even coerces people to don glasses, wear buns, and become librarians.

All of this has a supposedly economic basis. For example, the Bureau of Labor Statistics declares that librarians are a hot commodity: "job prospects are expected to be favorable," with many openings due to retirement. Is this hype? Probably. Unemployment, part-time work, years spent shelving books, and a limited career path face the would-be professional librarian.

Think of library science as technology meets liberal arts, with a healthy dose of SQL, databases, and human-computer interaction coursework. The education can be challenging, as library science is really a soft computer science degree.

There are 57 Master's-level programs accredited by the American Library Association, churning out a total of 6,500 new Master of Library and Information Science (MLIS) graduates per year. Libraries require this Master's degree for almost every public and school library position. Even more astounding, academic libraries typically require two Master's degrees: the MLIS plus another Master's in a subject specialty.

These 6,500 new librarians then run into the reality of finding positions actually within libraries. Part-time work, work below their educational level, an informal apprenticeship system, and other issues face would-be librarians. Many positions prefer candidates who already have library experience, a Catch-22 that permeates the system. Often "non-professional" work is required before landing even a part-time position, frequently without benefits. Note: the broad term "non-professional" covers anything from archival work to shelving books. Another key element is that the majority of job openings will come from retirements, which means applicants are waiting for senior staff to retire during a heavy recession.

According to the Bureau of Labor Statistics, median earnings are about $52,000. At one Oregon public library, the Librarian II category topped out at $61,000 per year; to reach that position, librarians need roughly 15-18 years of experience. This should give an idea of typical wages, though on the up side, most libraries are government-run, with an accompanying plush benefits package.

According to Library Journal's 2009 survey of the library field, only 70% of MLIS graduates reported full-time work after graduation, and post-graduation unemployment was almost 6%. Nearly one out of five (18%) MLIS graduates work part-time, and 13.5% settled for those infamous non-professional positions. What does this mean? It means that would-be librarians may do better to choose something more technical and/or develop a wide skill set to compete for scarce positions. Once underemployment is factored in, library graduates face a combined rate of roughly 15%, nearly double the national unemployment average of Summer 2010.

Obviously, libraries have an up-side. A safe career path, decent working hours, and a chance to help others make the top of the list. Many libraries are organized labor employers, and the work offers variety and intellectual stimulation. Libraries typically welcome career changers with families and responsibilities, and the field is appealing as a trendy career choice. Many programs emphasize technical skills, which are a definite asset in any workplace. While the MLIS is a great education, librarianship itself may not warrant being on the top career moves of 2011: the actual work is scant, part-time work is common, and you may end up shelving books for five years.


Want to Know What Really Makes Us Human? Better Know Your SQL

Article first published as Want to Know What Really Makes Us Human? Better Know Your SQL on Technorati.

Genetic research is hot stuff. From Homo sapiens to Drosophila melanogaster, everyone wants to know what life is really made of.

All that research data has to go somewhere, and often it is placed straight into a database, stuck somewhere, and (the theory is) accessed by men with lab coats and clipboards.


There is a different truth: genetic researchers were apparently some of the original open-source pioneers. From the Online Mendelian Inheritance in Man (OMIM) database and Entrez's numerous genome collections to the genome of Illumina Corporation's CEO, this data is available for public use. The catch: you had better know your SQL to access it.

The OMIM database is a trove of data about genetic illnesses and vulnerabilities. Unlike Entrez or other, more complex databases, it can be searched by keyword, including diagnoses. OMIM pulls data from several other major genetic databases and compiles the results.

Thankfully, it also cites the location of the research in case you want to know, for example, the location of the genetic vulnerability for schizophrenia (note, there are several possible culprits).

Entrez is technically the 'life sciences search engine' (read: database collection) of the National Center for Biotechnology Information (NCBI). Entrez hosts open-access databases for all things genetic. Some are accessible by keyword searching, others through various forms of modified SQL. Be warned: it is imperative that users have SQL experience beyond the surface level. Genetic research data is not easily understood; thankfully, some information can still be gleaned by basic users.
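
To give a sense of the SQL skills involved, here is a small, entirely hypothetical local example: a slice of gene-disorder annotations loaded into SQLite and queried. The schema and rows are invented for illustration; the real databases are far larger and messier.

```python
# Hypothetical local example of the SQL involved: load a small slice of
# gene-disorder annotation data into SQLite and query it.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene_disorder (gene TEXT, locus TEXT, disorder TEXT)")
conn.executemany(
    "INSERT INTO gene_disorder VALUES (?, ?, ?)",
    [("DISC1", "1q42", "schizophrenia susceptibility"),
     ("NRG1", "8p12", "schizophrenia susceptibility"),
     ("HTT", "4p16", "Huntington disease")],
)

# Which loci have been associated with a given condition?
for gene, locus in conn.execute(
        "SELECT gene, locus FROM gene_disorder WHERE disorder LIKE ?",
        ("%schizophrenia%",)):
    print(gene, locus)
```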

Lastly, the not-so-open-source data: Illumina's human genomes (two data sets). Hosted on Amazon Web Services, these databases carry a nominal fee: Amazon charges per GB of data transfer, since Illumina's information lives in the cloud.

Illumina does not allow mucking around with keywords – it is SQL or nothing with this data. Interestingly, one entire genome is Illumina’s CEO, Jay Flatley. Yes, Illumina’s CEO lets any researcher with $2 play with his evolutionary code. It takes a dedicated man to publish his entire genetic makeup.

The OMIM databases are available for download via FTP, and they can also be mapped to users' own databases via XML, which is vital for a smooth transfer. Entrez has a whole utility suite for making remote queries and downloading results, but setting it all up requires some finesse.
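
That utility suite is NCBI's E-utilities. A hedged sketch of what a remote query and download can look like: the snippet below fetches a single nucleotide record in FASTA format, and the accession number is only an example.

```python
# Minimal sketch of a remote download through NCBI's E-utilities:
# fetch one nucleotide record in FASTA format. The accession is an example.
from urllib.parse import urlencode
from urllib.request import urlopen

params = urlencode({"db": "nucleotide", "id": "NM_000546",
                    "rettype": "fasta", "retmode": "text"})
url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?" + params

with urlopen(url) as response:
    print(response.read().decode()[:300])  # start of the FASTA record
```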

Theoretically, the Illumina data could be downloaded from Amazon and mapped, and the cost would be fairly minor. This makes Mr. Flatley technically immortal, because his genetic code is now open-source for eternity.

There is really nothing preventing access to genetic materials, even the genome of the H1N1 influenza virus. The crux is not the data but knowing how to use it; genetic scientists still have this domain locked tight. Still, if you want to research genetic illnesses, practice your SQL on some novel resources, or download genomic data, it is absolutely possible. In twenty years, certain medical diagnoses may well require genetic tests.

In that future, Entrez, OMIM, and even Illumina may slide into the mainstream Internet search collective. Until then, if you want to access the human genome, you had better know your SQL.


Jing: Viral PowerPoints Made Easy!

Have you ever wanted to place a screenshot directly onto Twitter? Do you want to record your screen and stick the video on Facebook? Even though you may regret doing this later in life, TechSmith has made this possible with Jing, their simple-to-use screen/audio capture and upload program.



Gliffy: Cloud-based Systems Design Software

Gliffy is the handiest cloud-based information systems design program to date. Much beloved by cloud-service users, it combines several sets of design templates with a drag-and-drop interface. Features include downloading, exporting, and the ever-popular ability to share with other users. It is this last element that sets Gliffy apart from desktop design programs: typical design software requires sending various drafts back and forth between designers, or hosting them on a company server. Gliffy eliminates this troublesome step with its cloud-based service.

