Chi-Chao Chang

Palo Alto, California, United States
3K followers 500+ connections

Experience

  • Indeed Graphic

    Indeed

    San Francisco Bay Area

  • -

    Los Altos, California

  • -

    Palo Alto, California

  • -

    San Francisco Bay Area

  • -

    San Francisco, California, United States

  • -

    Dublin, Leinster, Ireland

  • -

    Redwood City, California

  • -

    East Palo Alto

  • -

    Menlo Park

  • -

    Palo Alto, CA

  • -

    San Francisco, CA

Education

Publications

  • Optimizing query rewrites for keyword-based advertising

    Electronic Commerce

    We consider the problem of query rewrites in the context of pay-per-click search advertising. Given a three-layer graph consisting of queries, query rewrites, and the corresponding ads that can be served for the rewrites, we formulate a family of graph covering problems whose goals are to suggest a subset of ads with the maximum benefit by suggesting rewrites for a given query. We obtain constant-factor approximation algorithms for these covering problems, under two versions of constraints and a realistic notion of ad benefit. We perform experiments on real data and show that our algorithms are capable of outperforming a competitive baseline algorithm in terms of the benefit of the rewrites.

  • Simrank++: Query Rewriting through Link Analysis of the Graph

    VLDB

    We focus on the problem of query rewriting for sponsored search. We base rewrites on a historical click graph that records the ads that have been clicked on in response to past user queries. Given a query q, we first consider Simrank [7] as a way to identify queries similar to q, i.e., queries whose ads a user may be interested in. We argue that Simrank fails to properly identify query similarities in our application, and we present two enhanced versions of Simrank: one that exploits weights on click graph edges and another that exploits “evidence.” We experimentally evaluate our new schemes against Simrank, using actual click graphs and queries from Yahoo!, and using a variety of metrics. Our results show that the enhanced methods can yield more and better query rewrites.

  • On the robustness of relevance measures with incomplete judgments

    SIGIR

    We investigate the robustness of three widely used IR relevance measures for large data collections with incomplete judgments. The relevance measures we consider are the bpref measure introduced by Buckley and Voorhees [7], the inferred average precision (infAP) introduced by Aslam and Yilmaz [4], and the normalized discounted cumulative gain (NDCG) measure introduced by Järvelin and Kekäläinen [8]. Our main results show that NDCG consistently performs better than both bpref and infAP. The experiments are performed on standard TREC datasets, under different levels of incompleteness of judgments, and using two different evaluation methods, namely, the Kendall correlation measures order between system rankings and pairwise statistical significance testing; the latter may be of independent interest.

  • Searching with context

    WWW

    Contextual search refers to proactively capturing the information need of a user by automatically augmenting the user query with information extracted from the search context; for example, by using terms from the web page the user is currently browsing or a file the user is currently editing. We present three different algorithms to implement contextual search for the Web. The first, query rewriting (QR), augments each query with appropriate terms from the search context and uses an off-the-shelf web search engine to answer this augmented query. The second, rank-biasing (RB), generates a representation of the context and answers queries using a custom-built search engine that exploits this representation. The third, iterative filtering meta-search (IFM), generates multiple subqueries based on the user query and appropriate terms from the search context, uses an off-the-shelf search engine to answer these subqueries, and re-ranks the results of the subqueries using rank aggregation methods. We extensively evaluate the three methods using 200 contexts and over 24,000 human relevance judgments of search results. We show that while QR works surprisingly well, the relevance and recall can be improved using RB and substantially more using IFM. Thus, QR, RB, and IFM represent a cost-effective design spectrum for contextual search.

  • Y!Q: contextual search at the point of inspiration

    Proceedings of CIKM International Conference on Information and Knowledge Management, Bremen, Germany

  • An analysis of search engine switching behavior using click streams

    WWW (Special Interest Tracks and Posters)

    In this paper, we propose a simple framework to characterize the switching behavior between search engines based on click streams. We segment users into a number of categories based on their search engine usage during two adjacent time periods and construct the transition probability matrix across these usage categories. The principal eigenvector of the transposed transition probability matrix represents the limiting probabilities, which are proportions of users in each usage category at steady state. We experiment with this framework using click streams focusing on two search engines: one with a large market share and the other with a small market share. The results offer interesting insights into search engine switching. The limiting probabilities provide empirical evidence that small engines can still retain their fair share of users over time.

  • An Analysis of Search Engine Switching Behavior Using Click Streams

    Web and Internet Economics (WINE)

    In this paper, we propose a simple framework to characterize the switching behavior between search engines based on user click stream data. We cluster users into a number of categories based on their search engine usage pattern during two adjacent time periods and construct the transition probability matrix across these usage categories. The principal eigenvector of the transposed transition probability matrix represents the limiting probabilities, which are proportions of users in each usage category at steady state. We experiment with this framework using real click stream data focusing on two search engines: one with a large market share and another with a small market share. The results offer interesting insights into search engine switching. The limiting probabilities provide empirical evidence that small engines can still retain their fair share of users over time.

    Other authors
    • Yun-Fang Juan
  • Exploring cost-effective approaches to human evaluation of search engine relevance

    ECIR

    In this paper, we examine novel and less expensive methods for search engine evaluation that do not rely on document relevance judgments. These methods, described within a proposed framework, are motivated by the increasing focus on search results presentation, by the growing diversity of documents and content sources, and by the need to measure effectiveness relative to other search engines. Correlation analysis of the data obtained from actual tests using a subset of the methods in the framework suggests that these methods measure different aspects of the search engine. In practice, we argue that the selection of the test method is a tradeoff between measurement intent and cost.

  • Javia: A Java interface to the virtual interface architecture

    Concurrency Practice and Experience

    The Virtual Interface (VI) architecture has become the industry standard for user‐level network interfaces. This paper presents the implementation and evaluation of Javia, a Java interface to the VI architecture. Javia explores two points in the design space. The first approach manages buffers in C and requires data copies between the Java heap and native buffers. The second approach relies on a Java‐level buffer abstraction that eliminates the copies in the first approach. Javia achieves an effective bandwidth of 80 Mbytes s−1 for 8 kbyte messages, which is within 1% of those achieved by C programs. Performance evaluations of parallel matrix multiplication and of the active messages communication protocol show that Javia can serve as an efficient building block for Java cluster applications. Copyright © 2000 John Wiley & Sons, Ltd.

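The two search engine switching abstracts above describe limiting probabilities obtained as the principal eigenvector of the transposed transition probability matrix. A minimal sketch of that computation follows, assuming a row-stochastic matrix over usage categories and using plain power iteration. The function name, the three categories, and the matrix values are hypothetical illustrations, not data or code from the papers.

```python
def limiting_probabilities(P, tol=1e-12, max_iter=10_000):
    """Approximate the stationary distribution pi satisfying pi = pi P.

    P is a row-stochastic transition matrix given as a list of lists:
    P[i][j] is the probability that a user in usage category i in one
    period is in category j in the next period. Repeating pi <- pi P is
    power iteration on the transpose of P; it converges when the chain
    is irreducible and aperiodic (assumed here).
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(nxt)
        nxt = [x / total for x in nxt]  # guard against rounding drift
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical categories: loyal to the large engine, users of both,
# loyal to the small engine. The values are made up for illustration.
P = [
    [0.90, 0.08, 0.02],
    [0.30, 0.60, 0.10],
    [0.05, 0.15, 0.80],
]
pi = limiting_probabilities(P)
print([round(x, 4) for x in pi])
```

Each entry of `pi` is the long-run share of users in the corresponding category under these assumed transition rates; a nonzero share for the small-engine category is the kind of evidence the abstracts refer to.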

Patents

  • System and method for visualizing and relevance tuning search engine ranking functions

    Filed US 20090063464

    The present invention is directed towards systems and methods for generating a visual representation indicating performance of a system capable of accepting one or more inputs and producing an ordered set of one or more responsive outputs. The method of the present invention comprises selecting one or more benchmark inputs and generating an ordered output set for each of the one or more benchmark inputs, a given output set comprising one or more output items responsive to a given benchmark query. One or more pixels representing the one or more output items comprising the one or more output sets are generated, a given pixel containing a visual representation indicating a degree to which the output item represented by the pixel is relevant with respect to the benchmark input to which the output item is responsive. The one or more pixels representing the one or more output items comprising the one or more output sets are arranged in a circle in a manner indicative of the performance of the system.

  • Simulation Framework For Evaluating Designs For Sponsored Search Markets

    Filed US 2009/0171728 A1

    Method and apparatus for sponsored Internet-based search simulation.

