Capturing Emergent Semantics from Social Annotation Systems

Benz, Dominik

🇬🇧

Dissertation

Zusammenfassung

🇬🇧

The ongoing growth of the World Wide Web, catalyzed by the increasing possibility of ubiquitous access via a variety of devices, continues to strengthen its role as our prevalent information and commmunication medium. However, although tools like search engines facilitate retrieval, the task of finally making sense of Web content is still often left to human interpretation. The vision of supporting both humans and machines in such knowledge-based activities led to the development of different systems which allow to structure Web resources by metadata annotations. Interestingly, two major approaches which gained a considerable amount of attention are addressing the problem from nearly opposite directions: On the one hand, the idea of the Semantic Web suggests to formalize the knowledge within a particular domain by means of the "top-down" approach of defining ontologies. On the other hand, Social Annotation Systems as part of the so-called Web 2.0 movement implement a "bottom-up" style of categorization using arbitrary keywords.
Experience as well as research in the characteristics of both systems has shown that their strengths and weaknesses seem to be inverse: While Social Annotation suffers from problems like, e. g., ambiguity or lack or precision, ontologies were especially designed to eliminate those. On the contrary, the latter suffer from a
knowledge acquisition bottleneck, which is successfully overcome by the large user populations of Social Annotation Systems. Instead of being regarded as competing paradigms, the obvious potential synergies from a combination of both motivated approaches to "bridge the gap" between them. These were fostered by the evidence of emergent semantics, i. e., the self-organized evolution of implicit conceptual structures, within Social Annotation data. While several techniques to exploit the emergent patterns were proposed, a systematic analysis - especially regarding paradigms from the field of ontology learning - is still largely missing. This also includes a deeper understanding of the circumstances which affect the evolution processes.
This work aims to address this gap by providing an in-depth study of methods and influencing factors to capture emergent semantics from Social Annotation Systems. We focus hereby on the acquisition of lexical semantics from the underlying networks of keywords, users and resources. Structured along different ontology learning tasks, we use a methodology of semantic grounding to characterize and evaluate the semantic relations captured by different methods. In all cases, our studies are based on datasets from several Social Annotation Systems.
Specifically, we first analyze semantic relatedness among keywords, and identify measures which detect different notions of relatedness. These constitute the input of concept learning algorithms, which focus then on the discovery of synonymous and ambiguous keywords. Hereby, we assess the usefulness of various clustering techniques. As a prerequisite to induce hierarchical relationships, our next step is to study measures which quantify the level of generality of a particular keyword. We find that comparatively simple measures can approximate the generality information encoded in reference taxonomies. These insights are used to inform the final task, namely the creation of concept hierarchies. For this purpose, generality-based algorithms exhibit advantages compared to clustering approaches.
In order to complement the identification of suitable methods to capture semantic structures, we analyze as a next step several factors which influence their emergence. Empirical evidence is provided that the amount of available data plays a crucial role for determining keyword meanings. From a different perspective, we examine pragmatic aspects by considering different annotation patterns among users. Based on a broad distinction between "categorizers" and "describers", we find that the latter produce more accurate results. This suggests a causal link between pragmatic and semantic aspects of keyword annotation. As a special kind of usage pattern, we then have a look at system abuse and spam. While observing a mixed picture, we suggest that an individual decision should be taken instead of disregarding spammers as a matter of principle.
Finally, we discuss a set of applications which operationalize the results of our studies for enhancing both Social Annotation and semantic systems. These comprise on the one hand tools which foster the emergence of semantics, and on the one hand applications which exploit the socially induced relations to
improve, e. g., searching, browsing, or user profiling facilities. In summary, the contributions of this work highlight viable methods and crucial aspects for designing enhanced knowledge-based services of a Social Semantic Web.

Sammlung(en)

Dissertationen (Wissensverarbeitung)

Zitieren

BibTex

@phdthesis{urn:nbn:de:hebis:34-2013022642523,
   author={Benz, Dominik},
   title={Capturing Emergent Semantics from Social Annotation Systems},
   school={Kassel, Universität, FB 16, Elektrotechnik/Informatik},
   month={02},
   year={2013}
}

0500 Oax
0501 Text $btxt$2rdacontent
0502 Computermedien $bc$2rdacarrier
1100 2013$n2013
1500 1/eng
2050 ##0##urn:nbn:de:hebis:34-2013022642523
3000 Benz, Dominik
4000 Capturing Emergent Semantics from Social Annotation Systems / Benz, Dominik
4030 
4060 Online-Ressource
4085 ##0##=u http://nbn-resolving.de/urn:nbn:de:hebis:34-2013022642523=x R
4204 \$dDissertation
4170 
5550 {{Wissensextraktion}}
5550 {{Ontologie <Wissensverarbeitung>}}
5550 {{Social Tagging}}
5550 {{Semantic Web}}
7136 ##0##urn:nbn:de:hebis:34-2013022642523

<resource xsi:schemaLocation="http://datacite.org/schema/kernel-2.2 http://schema.datacite.org/meta/kernel-2.2/metadata.xsd">
2013-02-26T08:27:06Z
2013-02-26T08:27:06Z
2013-02-26
urn:nbn:de:hebis:34-2013022642523
http://hdl.handle.net/123456789/2013022642523
eng
Urheberrechtlich gesch&uuml;tzt
https://rightsstatements.org/page/InC/1.0/
Ontology Learning
Semantics
Social Annotation Systems
Folksonomies
Knowledge Discovery
Social Web
Semantic Web
004
Capturing Emergent Semantics from Social Annotation Systems
Dissertation
The ongoing growth of the World Wide Web, catalyzed by the increasing possibility of ubiquitous access via a variety of devices, continues to strengthen its role as our prevalent information and commmunication medium. However, although tools like search engines facilitate retrieval, the task of finally making sense of Web content is still often left to human interpretation. The vision of supporting both humans and machines in such knowledge-based activities led to the development of different systems which allow to structure Web resources by metadata annotations. Interestingly, two major approaches which gained a considerable amount of attention are addressing the problem from nearly opposite directions: On the one hand, the idea of the Semantic Web suggests to formalize the knowledge within a particular domain by means of the "top-down" approach of defining ontologies. On the other hand, Social Annotation Systems as part of the so-called Web 2.0 movement implement a "bottom-up" style of categorization using arbitrary keywords.&#13;
Experience as well as research in the characteristics of both systems has shown that their strengths and weaknesses seem to be inverse: While Social Annotation suffers from problems like, e. g., ambiguity or lack or precision, ontologies were especially designed to eliminate those. On the contrary, the latter suffer from a&#13;
knowledge acquisition bottleneck, which is successfully overcome by the large user populations of Social Annotation Systems. Instead of being regarded as competing paradigms, the obvious potential synergies from a combination of both motivated approaches to "bridge the gap" between them. These were fostered by the evidence of emergent semantics, i. e., the self-organized evolution of implicit conceptual structures, within Social Annotation data. While several techniques to exploit the emergent patterns were proposed, a systematic analysis - especially regarding paradigms from the field of ontology learning - is still largely missing. This also includes a deeper understanding of the circumstances which affect the evolution processes.&#13;
This work aims to address this gap by providing an in-depth study of methods and influencing factors to capture emergent semantics from Social Annotation Systems. We focus hereby on the acquisition of lexical semantics from the underlying networks of keywords, users and resources. Structured along different ontology learning tasks, we use a methodology of semantic grounding to characterize and evaluate the semantic relations captured by different methods. In all cases, our studies are based on datasets from several Social Annotation Systems.&#13;
Specifically, we first analyze semantic relatedness among keywords, and identify measures which detect different notions of relatedness. These constitute the input of concept learning algorithms, which focus then on the discovery of synonymous and ambiguous keywords. Hereby, we assess the usefulness of various clustering techniques. As a prerequisite to induce hierarchical relationships, our next step is to study measures which quantify the level of generality of a particular keyword. We find that comparatively simple measures can approximate the generality information encoded in reference taxonomies. These insights are used to inform the final task, namely the creation of concept hierarchies. For this purpose, generality-based algorithms exhibit advantages compared to clustering approaches.&#13;
In order to complement the identification of suitable methods to capture semantic structures, we analyze as a next step several factors which influence their emergence. Empirical evidence is provided that the amount of available data plays a crucial role for determining keyword meanings. From a different perspective, we examine pragmatic aspects by considering different annotation patterns among users. Based on a broad distinction between "categorizers" and "describers", we find that the latter produce more accurate results. This suggests a causal link between pragmatic and semantic aspects of keyword annotation. As a special kind of usage pattern, we then have a look at system abuse and spam. While observing a mixed picture, we suggest that an individual decision should be taken instead of disregarding spammers as a matter of principle. &#13;
Finally, we discuss a set of applications which operationalize the results of our studies for enhancing both Social Annotation and semantic systems. These comprise on the one hand tools which foster the emergence of semantics, and on the one hand applications which exploit the socially induced relations to&#13;
improve, e. g., searching, browsing, or user profiling facilities. In summary, the contributions of this work highlight viable methods and crucial aspects for designing enhanced knowledge-based services of a Social Semantic Web.
open access
Benz, Dominik
Kassel, Universit&auml;t, FB 16, Elektrotechnik/Informatik
Stumme, Gerd (Prof. Dr.)
Cimiano, Philipp (Prof. Dr)
H.1.2
H.2.8
68P20
Wissensextraktion
Ontologie &lt;Wissensverarbeitung&gt;
Social Tagging
Semantic Web
2012-03-02
</resource>

Die folgenden Lizenzbestimmungen sind mit dieser Ressource verbunden:

:
Urheberrechtlich geschützt

Öffnen

Datum

Autor

Schlagwort

URI

Metadata