Global Internet Freedom

About The Workshop

According to a recent report by Freedom House (freedomhouse.org), an “independent watchdog organization dedicated to the expansion of freedom and democracy around the world”, political rights and civil liberties around the world deteriorated in 2017 to their lowest point in more than a decade. Online manipulation and disinformation tactics played an important role in elections in at least 18 countries over the past year, including the United States (see Freedom House reports). Disinformation tactics contributed to a seventh consecutive year of overall decline in internet freedom, as did a rise in disruptions to mobile internet service and increases in physical and technical attacks on human rights defenders and independent media. A record number of governments have restricted mobile internet service for political or security reasons, often in areas populated by ethnic or religious minorities.

The use of “fake news,” automated “bot” accounts, and other manipulation methods gained particular attention in the United States. While the country’s online environment remained generally free, it was troubled by a proliferation of fabricated news articles, divisive partisan vitriol, and aggressive harassment of many journalists, both during and after the presidential election campaign. Venezuela, the Philippines, and Turkey were among 30 countries where governments were found to employ armies of “opinion shapers” to spread government views, drive particular agendas, and counter government critics on social media.

The advent of the Internet as a mass medium has had a profound effect on the way political agendas and ideological messages are spread to larger and larger audiences. Nowadays, social media and messaging apps may be exploited not only by large institutions and governments, but also by small organizations or individuals, to reach an audience of unprecedented size. Such strategies have reportedly been used to influence voters' opinions in the 2016 U.S. elections and in the Brexit referendum. These alleged consequences have inevitably attracted the attention of large institutions, governments, and social media companies, and prompted them to search for countermeasures. The number of governments attempting to control online discussions in this manner has risen each year since Freedom House began systematically tracking the phenomenon in 2009.

Various barriers prevent citizens of a large number of countries from accessing information. Some are infrastructural and economic barriers; others are violations of user rights, such as surveillance, privacy violations, and repercussions for online speech and activities, including imprisonment, extralegal harassment, or cyberattacks. Yet another area is limits on content, which involve legal regulations on content, technical filtering and blocking of websites, and (self-)censorship. Large internet service providers (ISPs) are effective monopolies and have the power to use NLP techniques to control information flow. Users are suspended or banned, sometimes without human intervention, and with little opportunity for redress. Users react to this by using coded, oblique, or metaphorical language and by taking steps to conceal their identity, such as the use of multiple accounts, which raises questions about who the real originating author of a post actually is.

The topic is of great interest to the research community: there has been a special issue of Big Data on computational propaganda, there is an ongoing shared task on hyperpartisanship and extreme bias, a workshop on fact extraction and verification, and an upcoming hackathon on propaganda identification. The problem of detecting the use and spread of propaganda and extreme bias in society has also been tackled in other fields, such as network analysis, social sciences/data analysis, and psychology. On the NLP side, efforts have focused on identifying whether a full news article is real, fake, propaganda, or satire, on whether it is propaganda or not, and on the detection of hyperpartisanship.

The first NLP4IF workshop (https://cbrew.github.io/nlp4if/) took place in Santa Fe, NM, on August 20, in conjunction with COLING 2018. It was dedicated to NLP methods that potentially contribute (either positively or negatively) to the free flow of information on the Internet, or to our understanding of the issues that arise in this area. The workshop was a great success, with 25+ participants. With the generous support of NSF, we managed to bring three invited speakers and organize a panel on disinformation. We have NSF support for next year's workshop as well. Much of the work presented at the workshop was devoted to censorship in China. We hope the next workshop will attract researchers who work on a variety of topics that contribute to the free flow of information on the Internet.

NLP4IF Proceedings 2018

The workshop is supported by the U.S. National Science Foundation, award No. 1828199. Students can apply for travel grants. For more information, please contact Anna Feldman (feldmana@montclair.edu).

The topics of interest include (but are not limited to) the following:

  • Censorship detection: detecting deleted or edited text; detecting blocked keywords/banned terms;
  • Censorship circumvention techniques: linguistically inspired countermeasures for Internet censorship such as keyword substitution, expanding coverage of existing banned terms, text paraphrasing, linguistic steganography, generating information morphs, etc.;
  • Detection of self-censorship;
  • Identifying potentially censorable content;
  • Disinformation/Misinformation detection: fake news, fake accounts, rumor detection, etc.;
  • Identification of propaganda at document and fragment level
  • Identification of hate speech
  • (Comparative) analysis of the language of propagandistic and biased texts
  • Automatic generation of persuasive content
  • Automatic debiasing of news content
  • Tools to facilitate the flagging, either automatic or manual, of propaganda and bias in social media
  • Automatic detection of coordinated propaganda campaigns such as the use of social bots, botnets, and water armies
  • Analysis of diffusion and consumption of propagandistic, hyperpartisan, and extremely biased content in social networks
  • Techniques to empirically measure Internet censorship across communication platforms;
  • Investigations on covert linguistic communication and its limits;
  • Identity and private information detection;
  • Passive and targeted surveillance techniques;
  • Ethics in NLP;
  • “Walled gardens”, personalization and fragmentation of the online public space;

We hope that our workshop will have a transformative impact on society by bringing us closer to achieving Internet freedom in countries where access to and sharing of information are strictly controlled by censorship.

    Schedule Detail

    • 09:00-10:00

      Invited talk: Jennifer Pan (Stanford University): How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, Not Engaged Argument slides

    • 10:00-10:30

      The effect of information controls on developers in China: An analysis of censorship in Chinese open source projects (Jeffrey Knockel, Masashi Crete-Nishihata and Lotus Ruan) slides,data

      By Jeff Knockel
    • 10:30-11:00

      Coffee break

    • 11:00-12:00

      Invited talk: Jed Crandall (University of New Mexico): How to Talk Dirty and Influence Machines slides

    • 12:00-12:30

      Linguistic Characteristics of Censorable Language on SinaWeibo (Kei Yin Ng, Anna Feldman, Jing Peng and Chris Leberknight) slides,data

    • 12:30-2:00

      Lunch

    • 2:00-3:00

      Invited Talk: Nancy Watzman (Dot Connector Studio): What do journalists really want from NLP researchers? How to help build trust in media and democracy by helping journalists make sense of big data slides

    • 3:00-3:30

      Creative Language Encoding under Censorship (Heng Ji and Kevin Knight) [slides available upon request]

      By Heng Ji
    • 3:30-4:00

      Coffee break

    • 4:00-5:00

      Panel: NLP and Disinformation (Moderator: Chris Brew; Panelists: Jed Crandall, Heng Ji, Veronica Perez-Rosas, Nancy Watzman)

    Our Speakers

    Dr. Jennifer Pan (Stanford University, CA)

    Assistant Professor

    Dr. Jedidiah Crandall (University of New Mexico)

    Associate Professor

    Nancy Watzman (Dot Connector Studio)

    Managing Editor, Television Archive

    VENUE

    Asia World Expo

    Airport Expo Blvd, Chek Lap Kok, Hong Kong

    The NLP4IF Workshop is held in conjunction with the Conference on Empirical Methods in Natural Language Processing and the International Joint Conference on Natural Language Processing 2019 (EMNLP-IJCNLP 2019), which will take place in Hong Kong. EMNLP-IJCNLP 2019 will be held at the Asia World Expo from November 3 through November 7, 2019.

    Important Dates

    • Workshop submission deadline: May 25, 2018
    • Notification: June 20, 2018
    • Camera-ready submission deadline: June 30, 2018
    • Workshop date: August 20, 2018

    Program

    According to a recent report by Freedom House (freedomhouse.org), an “independent watchdog organization dedicated to the expansion of freedom and democracy around the world”, Internet freedom declined in 2016 for the sixth consecutive year. Sixty-seven percent of all Internet users live in countries where criticism of the government, military, or ruling family is subject to censorship. Social media users face unprecedented penalties, as authorities in 38 countries made arrests based on social media posts over the past year. Globally, 27 percent of all Internet users live in countries where people have been arrested for publishing, sharing, or merely “liking” content on Facebook. Governments are increasingly going after messaging apps like WhatsApp and Telegram, which can spread information quickly and securely.

    Various barriers prevent citizens of a large number of countries from accessing information. Some are infrastructural and economic barriers; others are violations of user rights, such as surveillance, privacy violations, and repercussions for online speech and activities, including imprisonment, extralegal harassment, or cyberattacks. Yet another area is limits on content, which involve legal regulations on content, technical filtering and blocking of websites, and (self-)censorship. Large internet providers are effective monopolies and themselves have the power to use NLP techniques to control information flow. Users are suspended or banned, sometimes without human intervention, and with little opportunity for redress. Users react to this by using coded, oblique, or metaphorical language and by taking steps to conceal their identity, such as the use of multiple accounts, which raises questions about who the real originating author of a post actually is.

    This workshop should bring together NLP researchers whose work contributes to the free flow of information on the Internet.
    Submissions should be written in English and anonymized with regard to the authors and/or their institution (no author-identifying information on the title page or anywhere else in the paper), including in the referencing style, as usual. Authors should also ensure that identifying meta-information is removed from files submitted for review. Submissions must use the Word or LaTeX template files provided by COLING 2018 and conform to the format defined by the COLING 2018 style guidelines.

    • Long paper submission: up to 8 pages of content, plus 2 pages for references. Final versions of long papers may use one additional page: up to 9 pages, with unlimited pages for references.
    • Short paper submission: up to 4 pages of content, plus 2 pages for references. Final versions of short papers: up to 5 pages, with unlimited pages for references.

    PDF files must be submitted electronically via the [START submission system](https://www.softconf.com/coling2018/ws-NLP4IF/). The recommended style files are [available from the COLING repository](http://coling2018.org/wp-content/uploads/2018/01/coling2018.zip).

    Double submission policy: parallel submissions to other meetings or publications are possible, but the workshop contact person must be notified immediately. If a paper is accepted, withdrawal is only possible within two days after notification.
    To register, please go to http://coling2018.org/registration.
  • Alberto Barrón-Cedeño, Scientist, Qatar Computing Research Institute. albarron@qf.org.qa
  • Chris Brew, Computational Research Scientist, Facebook. christopher.brew@gmail.com
  • Giovanni Da San Martino, Scientist, Qatar Computing Research Institute. gmartino@qf.org.qa
  • Anna Feldman, Professor of Linguistics and Computer Science at Montclair State University. feldmana@montclair.edu
  • Chris Leberknight, Associate Professor of Computer Science at Montclair State University. leberknightc@montclair.edu
  • Preslav Nakov, Senior Scientist, Qatar Computing Research Institute. pnakov@qf.org.qa
  • Norah Abokhodair, Microsoft (US)
  • Banu Akdenizli, Northwestern University (Qatar)
  • Dyaa Albakour, Signal Media (UK)
  • Reda Alhajj, University of Calgary (Canada)
  • Jisun An, Qatar Computing Research Institute (Qatar)
  • Jed Crandall, University of New Mexico, NM (USA)
  • Kareem Darwish, Qatar Computing Research Institute (Qatar)
  • Gianmarco M. De Francisci, ISI Foundation (Italy)
  • Julio Gonzalo, UNED (Spain)
  • Philip N. Howard, Oxford Internet Institute (UK)
  • Zubin Jelveh, The University of Chicago Crime Lab, New York (USA)
  • Heng Ji, Rensselaer Polytechnic Institute, NY (USA)
  • Jeffrey Knockel, The Citizen Lab, University of Toronto (Canada)
  • Haewoon Kwak, Qatar Computing Research Institute (Qatar)
  • Jure Leskovec, Stanford University, CA (USA)
  • Miguel Martinez, Signal Media (UK)
  • Filippo Menczer, Indiana University, IN (USA)
  • Ivan Meza, National Autonomous University of Mexico (Mexico)
  • Rada Mihalcea, University of Michigan, MI (USA)
  • Prateek Mittal, Princeton University, NJ (USA)
  • Alessandro Moschitti, Amazon (USA)
  • Veronica Perez-Rosas, University of Michigan, MI (USA)
  • Martin Potthast, University of Leipzig (Germany)
  • Hannah Rashkin, University of Washington (USA)
  • Paolo Rosso, Technical University of Valencia (Spain)
  • Anna Rumshisky, University of Massachusetts, Lowell, MA (USA)
  • Mahmood Sharif, Carnegie Mellon University, PA (USA)
  • Thamar Solorio, University of Houston (USA)
  • Benno Stein, Bauhaus University Weimar (Germany)
  • Denis Stukal, New York University (USA)
  • Yulia Tsvetkov, Carnegie Mellon University, PA (USA)
  • Jay Van Bavel, New York University (USA)
  • Svetlana Volkova, Pacific Northwest National Laboratory
  • Henning Wachsmuth, University of Paderborn (Germany)
  • Brook Wu, New Jersey Institute of Technology, NJ (USA)
  • Mailing list for the workshop (https://groups.google.com/forum/#!forum/nlp4if)