[ecoop-info] Call for Papers -- UDME 2007 @ OOPSLA 2007 [30-10&11]

Mohamed Fayad m.fayad at sjsu.edu
Tue Aug 21 11:46:23 CEST 2007


Hello,  "Sorry for Multiple Copies"  "PLEASE DISTRIBUTE To Your Mailing 
Lists"  Thank you. Cheers,  M. Fayad
________________________________

The First International Workshop on 
Unified Data Mining Engine: Addressing Challenges 

UDME 2007

Call for Papers

Montréal,  Canada, October 22, 2007
 (in conjunction with OOPSLA 2007)

http://www.oopsla.org/oopsla2007 (OOPSLA 2007 Link) 
http://www.oopsla.org/oopsla2007/index.php?page=sub/&id=160 (Workshop Link 1)
http://www.engr.sjsu.edu/~fayad/workshops/UDME07 (Workshop Link 2)
http://www.vrlsoft.com/workshops/UDME07 (Workshop Link 3)

INTRODUCTION

Data mining is the discovery of knowledge by analyzing enormous sets of 
data, extracting the meaning of the data, and then predicting future 
trends. Data mining helps us uncover hidden information in large 
databases, and also helps companies make sound decisions based on 
knowledge and information. 

If we look closely at any data-mining tool, we can see that there is some 
common core logic that is independent of the data and the applications. 
Most existing implementations ignore that fact and concentrate on a 
specific problem, so the tool becomes limited to a particular set of data 
for a specific application.

Data mining is also about finding interesting patterns in data. The main 
challenge for any data-mining engine is how to apply different algorithms 
or techniques to different sets of data in order to find interesting 
patterns that are useful to a business. It is extremely difficult to come 
up with a standard way of analyzing data. The enormous volume and 
complexity of the data make it impossible to run the same algorithms on 
different datasets. Several vendors are trying to solve this problem, but 
most of them support only a subset of the available algorithms. None of 
them has come up with a stable engine that can work on any data set and 
in any domain.

In the last decade, improvements in storage and CPU speed have created a 
huge opportunity for different data mining applications, ranging from CRM 
to medical health care applications. The evolution of data mining is shown 
in Table 1. 

It is now very difficult to develop a single application that can take 
care of all of these problems. It is a dream even to think of an 
application that can iterate through any data and find patterns. Data 
mining also deals with useful patterns, not just patterns, and whether a 
pattern is useful depends on the context in which it is applied. 
Present-day tools depend solely on an expert to decide which algorithms to 
apply and how to analyze the output, because most of them are generic and 
no context-specific logic is attached to the application.

Here is a summary of the problems that we face today in existing data 
mining tools:
1.      Difficult to use - Existing data mining tools try to cover all 
different data mining applications, so they become very difficult to 
configure and run.
2.      Needs an expert to run the tool - No domain- or problem-specific 
logic is tied to the tool, so an expert is needed to run the tool and 
analyze the results.
3.      Difficult to add new functionality - Because of the size and 
complexity of each tool, it is very difficult to add any new feature.
4.      Difficult to interface - Algorithms developed by other companies 
cannot be integrated with the tool easily.
5.      Short lifetime - There is no stable component in the tool, and 
over time the tool becomes obsolete as new tools take the market. Changing 
the existing tool to incorporate new features is difficult and requires a 
lot of changes.
6.      Limited number of algorithms - Existing tools provide only a 
limited number of algorithms, and the use of multiple algorithms together 
is sometimes very limited.
7.      Needs a lot of resources - Existing tools are not optimized for 
any specific application, so they need a lot of resources, such as runtime 
memory and hard disk space.


Thus, this workshop is driven by three main questions. First, "how can we 
develop a unified data mining engine (UDME)?" Second, "what kinds of 
technologies and tools are needed to build such an engine?" Third, "how 
can we overcome the existing problems?" 

OBJECTIVE AND MOTIVATION


Building such an engine is not an easy exercise, especially when several 
factors can undermine its quality and success, such as cost, time, and the 
lack of systematic approaches. We would like to architect and develop a 
Unified Data Mining Engine (UDME) that has some or all of the following 
properties: 
1.      Ease of use - Multiple tools can be developed easily by focusing 
on specific problems, because they can all share the core services that 
are provided by the UDME.
2.      No need for an expert to run the tool - Domain-specific knowledge, 
such as verification and selection of algorithms, can be built into the 
tool itself while the tool is being developed.
3.      Easy to add new functionality - The application-specific logic 
should be separate from the core logic, so that new application-specific 
functionality can be added easily without any change to the core logic.
4.      Easy to interface - The design should be based on a system of 
independent patterns that can be developed by third-party vendors.
5.      Long lifetime - The engine should be based on stable core logic 
that has a long lifetime. The application logic, which can change over 
time, should be loosely connected to it.
6.      Multiple algorithms - The engine must support any number of 
algorithms.
7.      Fewer resources - The proposed engine should be developed by 
connecting several patterns or components. Depending on the application 
and domain, the engine can use only the patterns or components that are 
necessary, so it needs fewer resources than existing tools.
8.      Stable - The engine should be stable over time and provide a 
simple way to apply different data mining and data analysis algorithms to 
different sets of data in any domain.
9.      Isolation of application logic - We must also isolate the stable 
knowledge from any application-specific logic, so that different 
applications can use the same core knowledge, which need not be changed.
10.     Minimum maintenance cost - The maintenance cost of such an engine 
should be minimal. 
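
As a rough illustration of properties 3, 4, 6, and 9 above, the sketch 
below separates a small stable core from pluggable algorithms and from 
application-specific logic. It is not part of any UDME implementation: all 
names (MiningEngine, SalesTool, the "outliers" algorithm and its threshold) 
are hypothetical and chosen only for this example.

```python
from typing import Callable, Dict, List, Sequence

# All names below are hypothetical; this is not a UDME API.
Algorithm = Callable[[Sequence[float]], object]


class MiningEngine:
    """Stable core: it only knows how to register and run algorithms."""

    def __init__(self) -> None:
        self._algorithms: Dict[str, Algorithm] = {}

    def register(self, name: str, algorithm: Algorithm) -> None:
        # New or third-party algorithms plug in here; the core is unchanged.
        self._algorithms[name] = algorithm

    def available(self) -> List[str]:
        return sorted(self._algorithms)

    def run(self, name: str, data: Sequence[float]) -> object:
        if name not in self._algorithms:
            raise KeyError(f"unknown algorithm: {name}")
        return self._algorithms[name](data)


def mean(data: Sequence[float]) -> float:
    return sum(data) / len(data)


def outliers(data: Sequence[float], threshold: float = 2.0) -> List[float]:
    """Values more than `threshold` standard deviations from the mean."""
    m = mean(data)
    std = (sum((x - m) ** 2 for x in data) / len(data)) ** 0.5
    if std == 0:
        return []
    return [x for x in data if abs(x - m) / std > threshold]


class SalesTool:
    """Application-specific layer: it chooses the algorithm and interprets
    the result, so the end user does not have to be a data mining expert."""

    def __init__(self, engine: MiningEngine) -> None:
        self._engine = engine

    def report(self, daily_sales: Sequence[float]) -> str:
        unusual = self._engine.run("outliers", daily_sales)
        if unusual:
            return f"{len(unusual)} unusual day(s): {unusual}"
        return "no unusual days"


engine = MiningEngine()
engine.register("mean", mean)
engine.register("outliers", outliers)
print(SalesTool(engine).report([10, 11, 9, 10, 10, 50]))
```

In this sketch, supporting a new domain means writing another class like 
SalesTool, and supporting a new algorithm means one more register() call; 
neither requires a change to MiningEngine.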


WORKSHOP CHALLENGES

The workshop will address the challenges of building a unified data mining 
engine. We invite researchers, framework developers, and application 
developers to discuss and debate questions in the following two areas:
 
I.      UDME Architecture
a.      What is the best approach for building such an engine?
b.      What are the bases for creating the engine architecture?
c.      Are there any guidelines, methodologies, and/or processes for an 
engine architecture creation and development?
d.      What are the components of the unified data mining engine 
architecture?
e.      What kinds of patterns or components appear in UDME?
f.      Show how your engine architecture meets the above UDME properties.

II.     UDME Development
a.      What is the best way to develop such an engine?
b.      What are the techniques and tools for developing such an engine?
c.      Show how to extend your engine to new application logic.

SUBMISSIONS

Developers and programmers who are interested in participating in the 
workshop are requested to submit a short position paper (3-5 pages) or a 
regular workshop paper (6-15 pages, double spaced, including figures) 
presenting views and experiences that are relevant to the given discussion 
topics. The title page must include an abstract of at most 150 words, five 
keywords, full mailing address, e-mail address, phone number, fax number, 
and a designated contact author. Workshop papers will be selected based on 
their originality, significance, correctness, presentation quality, and 
relevance to the workshop. Papers should be submitted electronically to 
the chair. Please follow the instructions that are provided on the web 
page. Camera-ready manuscripts must follow the ACM SIGPLAN conference 
proceedings style and guidelines. We also encourage authors to present 
novel and fresh ideas, critiques of existing work, and practical studies. 

Each accepted workshop paper must be presented in person, either by the 
author or by one of the co-authors. To foster lively discussion, authors 
are encouraged to present open-ended questions and one or two main 
statements for discussion at the workshop. Submissions must be made in 
either MS-Word or RTF format (please DO NOT compress files).

Depending on the total number and spread of contributions, the scope may 
be narrowed further to ensure an effective communication and 
information-sharing session. Accepted position papers will be distributed 
to the participants just before the workshop and will be made generally 
available through the WWW and FTP. Accepted papers will also be published 
in the Workshop Proceedings. At least one author of each accepted paper 
must register as a full delegate for the workshop. Selected papers will be 
published in a future issue of the online International Journal of 
Patterns (IJOP), www.ijop.org, and/or the International Journal of 
Software Architectures (IJSA), www.ijsa.net.

PARTICIPATION

People who are interested in participating in the workshop without making 
a submission are requested to fill out the participation form and e-mail 
it to any of the workshop chairs. 
-------------------------------------------------
PARTICIPATION FORM:
Name and Affiliation:
Position: 
Address: 
E-mail:
URL:
Areas of interest:
Reasons for Attending:
-------------------------------------------------
Please note that registration is mandatory in order to participate in the 
workshop. An early registration discount is available to all participants. 
An overhead projector and a flipchart will also be made available to all 
participants. 

You may also contact the organizers by e-mail or by phone.

WORKSHOP AGENDA

1. Welcome and introduction of participants. The organizers will first 
provide a short overview of all open issues and of the main arguments 
arising from the position papers. (Estimated time: 20-30 minutes)

2. Selected authors (who will represent the main trends) will each be 
allotted 20 minutes to explain how their position relates to other 
positions and what they see as the three major issues. We expect about 
5-10 position papers in this session. (Estimated time: 120-130 minutes) 

3. The organizers will propose a process for identifying the major issues, 
and the participants will then discuss and select what they perceive to be 
the hottest issues to examine and analyze. (Estimated time: 10-15 minutes)

4. The participants will work for 70-95 minutes in small groups, each led 
by a designated moderator. Each group will deal with two identified but 
different hot issues and will produce a summary note in the form of points 
and counterpoints, showing either how several views are irreducibly 
opposed or how they are complementary. The total number of groups will 
depend mainly on the number of participants and issues selected; ideally 
there should be 3-5 people in each group. (Estimated time: 60-70 minutes) 

5. Each group will be given 10-15 minutes to present its findings and 
inferences to the workshop. A closing discussion will follow. The workshop 
report will be based on these findings and will include a clear-cut agenda 
for future exploration and cooperation; it will be made available through 
the WWW and FTP. (Estimated time: 50-60 minutes for five teams) 

(Total estimated time: 285-315 minutes, i.e. about five hours +/- 15 
minutes; lunch and breaks are not included.) 


IMPORTANT DATES

Dates will be updated based on the acceptance process.

Submission deadline:            September 14, 2007
Acceptance notification:        September 30, 2007
Camera-ready paper due:         October 10, 2007
Workshop date:                  October 22, 2007
Conference begins:              October 21, 2007

ORGANIZERS

DR. M.E. FAYAD  (CHAIR)
Professor of Computer Engineering
Computer Engineering Dept., College of Engineering
San José State University
One Washington Square, San José, CA 95192-0180
Ph: (408) 924-7364, Fax: (408) 924-4153
E-mail: m.fayad at sjsu.edu, mefayad at gmail.com
http://www.engr.sjsu.edu/fayad

DR. TAREK HELMY (CO-CHAIR)
College of Computer Science and Engineering, 
Department of Information and Computer Science, 
King Fahd University of Petroleum and Minerals, 
Dhahran 31261, Mail Box. 413, Saudi Arabia. 
Ph: 9663-860-1967 (Office)
E-mail: helmy at ccse.kfupm.edu.sa

DR. RAMI BAHSOON (CO-CHAIR)
School of Engineering and Applied Science
Aston University in Birmingham, Birmingham B4 7ET, United Kingdom
Office: Main Building, Second Floor, MB 213E
Ph: +44 (0) 121 204 3464
Fax: +44 (0) 121 204 3681
URL: http://www-users.aston.ac.uk/~bahsoonr/index.htm

PROFESSOR DILIP PATEL (CO-CHAIR)
Faculty of Business, Computing and Information Management
London South Bank University
103 Borough Road
London SE1 0AA, United Kingdom
TEL: +44 (0)20 7815 7429

SOMENATH DAS (CO-CHAIR)
eBay, Inc.
2211 North First Street
San Jose, CA 95131, USA
Ph: 408 967 4151
E-mail: sodas at ebay.com

EDUARDO M. SEGURA (CO-CHAIR)
vrlSoft, Inc.
2065 Martin Ave., Suite 103
Santa Clara, CA 95050-2707
Phone/Fax: (408) 654-8972
E-mail: esegura at vrlsoft.com, eduardo.segura at sjsu.edu
http://www.vrlsoft.com

PROGRAM COMMITTEE

Rami Bahsoon, Aston University in Birmingham, United Kingdom
Rogerio Atem de Carvalho,  Federal Center for Technological Education of 
Campos, Brazil
Chia-Chu Chiang, University of Arkansas, Little Rock, USA
Issam Wajih Damaj, Dhofar University, Salalah, Sultanate of Oman
Somenath Das, eBay, Inc., USA
Dilip Patel, London South Bank University, United Kingdom 
Jurgen Dix, Clausthal University of Technology, Germany
M.E. Fayad,  San Jose State University and vrlSoft, Inc, Silicon Valley, 
USA
Jaafar Gaber, Université de Technologie de Belfort-Montbéliard, France 
Rosario Girardi,  Federal University of Maranhão, São Luís, Brazil
Tarek Helmy,  King Fahd University of Petroleum and Minerals, Dhahran, 
Saudi Arabia
Hoda Hosny, The American University in Cairo, Egypt
A. Kannammal, Coimbatore Institute of Technology, Tamil Nadu, India
Mohamed-Khireddine Kholladi, University of Constantine, Algeria
Dae-Kyoo Kim, Oakland University, USA
Roger (Buzz) King,  University of Colorado, Boulder CO, USA
Jianzhi Li, De Montfort University, United Kingdom
Nashat Mansour, Lebanese American University, Lebanon 
Tokuro Matsuo,  Yamagata University, Japan
Srini Ramaswamy,  University of Arkansas, Little Rock, USA
Miguel Garre Rubio, Universidad de Alcalá, Madrid, Spain
Eduardo M. Segura,  San Jose State University and vrlSoft, Inc, Silicon 
Valley, USA
Jaroslav Zendulka,  Brno University of Technology, Czech Republic