Transcript

Dr Sunčica Hadžidedić

Saffron Zainchkovskaya

Recommender systems: exploring the effects of interpretable explainability on popular performance metrics

Issues:

  1. Single domain exploration [3,5,6,7,8,11]
  2. Lack of reported performance metrics – evaluations are mostly user-study based
    1. (and therefore less conclusive)
  3. Lack of research on the effect of adding explainability on common metrics such as RMSE, precision, and recall (see the metrics sketch after this list).
    1. Some papers ignore standard performance metrics entirely.
  4. Severe lack of human-interpretable explanations
    1. Knowledge graphs
    2. Cluster graphs
    3. Latent factors
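
To make item 3 concrete, the sketch below (plain Python with NumPy; all variable names are illustrative) computes the three standard metrics in question: RMSE over predicted ratings, and precision/recall over a top-k recommendation list. Running the same checks on a baseline RS and an ERS shows whether adding explainability has moved these numbers.

  import numpy as np

  def rmse(predicted, actual):
      # Root mean squared error between predicted and true ratings.
      predicted, actual = np.asarray(predicted), np.asarray(actual)
      return np.sqrt(np.mean((predicted - actual) ** 2))

  def precision_recall_at_k(recommended, relevant, k=10):
      # recommended: ranked list of item ids; relevant: set of items the
      # user actually liked (e.g. rated above some threshold).
      top_k = recommended[:k]
      hits = len(set(top_k) & set(relevant))
      precision = hits / k
      recall = hits / len(relevant) if relevant else 0.0
      return precision, recall

  print(rmse([3.5, 4.0, 2.0], [4.0, 4.0, 1.0]))            # ~0.645
  print(precision_recall_at_k([1, 2, 3], {2, 3, 9}, k=3))  # (0.67, 0.67)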

Literature

  1. Understand explainability in RSs across different domains
    1. Identifying any domain-specific challenges
  2. Examine the effects of integrating explainability features on standard performance metrics, across multiple domains.
  3. Address the gap in knowledge about how different ways of incorporating explainability affect a range of evaluation metrics.
  4. Transform common explainability types into interpretable versions
    1. Use metrics to evaluate such explanations.

Aims for this project

  1. Complete data collection and cleaning for the high-impact domain
  2. Expand the RS & ERS to a front-end application
  3. Implement the back end for the high-impact domain (RS* & ERS*).
  4. Perform offline comparisons to assess the impact of domain on explainability and trust.
  1. Launch a complete mobile application for the RSs in the high-impact domain.
  2. Execute a comprehensive user study on explainability and trust in the RS.
  1. Complete data collection and cleaning for the low-impact domain
  2. Develop a baseline recommendation system (RS) and an explainable RS (ERS) in the low-impact domain.
  3. Conduct offline tests to compare the explainable and trustworthy RS against the baseline

INTERMEDIATE

ADVANCED

BASIC

OLD DELIVERABLES

  1. Complete data collection and cleaning for the high-impact domain
  2. Implement the back end for the high-impact domain (RS* & ERS*).
  3. Perform offline comparisons to assess the impact of the domain on explainability and performance metrics.
  1. Use the GPT-3 API to process explanations produced by models into a more interpretable format.
  2. Test this novel approach by executing a user focus group on the types of explainability produced.
  1. Research the optimal techniques for explainability and evaluation
  2. Complete data collection and cleaning for the low-impact domain
  3. Develop a baseline recommendation system (RS) and an explainable RS (ERS) in the low-impact domain.
  4. Conduct offline tests to compare the explainable and trustworthy RS against the baseline

INTERMEDIATE

ADVANCED

BASIC

NEW DELIVERABLES

  1. Complete data collection and cleaning for the high-impact domain
  2. Implement the back end for the high-impact domain (RS* & ERS*).
  3. Perform offline comparisons to assess the impact of the domain on explainability and performance metrics.
  1. Use the ChatGPT API to process explanations produced by models into a more user-friendly format (see the sketch after this list).
  2. Test this novel approach by executing a user focus group on the types of explainability produced.
  1. Research the optimal techniques for explainability and evaluation
  2. Complete data collection and cleaning for the low-impact domain, including text pre-processing using transformers (BERT)
  3. Develop a baseline recommendation system (RS) and an explainable RS (ERS) in the low-impact domain.
  4. Conduct offline tests to compare the explainable and trustworthy RS against the baseline
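
A minimal sketch of the ChatGPT step above, assuming the official openai Python package; the model name, prompt wording, and factor labels are illustrative placeholders, not the project's final choices:

  from openai import OpenAI

  client = OpenAI()  # reads OPENAI_API_KEY from the environment

  def humanise_explanation(movie_title, latent_factors):
      # latent_factors: (label, weight) pairs taken from the trained model;
      # the labels here are hypothetical.
      factors = ", ".join(f"{name} ({weight:.2f})" for name, weight in latent_factors)
      response = client.chat.completions.create(
          model="gpt-3.5-turbo",
          messages=[
              {"role": "system",
               "content": "Rewrite recommender-system latent factors as one "
                          "short, friendly sentence explaining a recommendation."},
              {"role": "user",
               "content": f"Movie: {movie_title}. Factors: {factors}"},
          ],
      )
      return response.choices[0].message.content

  # e.g. humanise_explanation("Se7en", [("dark thriller", 0.81), ("strong cast", 0.64)])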

INTERMEDIATE

ADVANCED

BASIC

Progress on Deliverables

GPT-3 explanations with latent factors

  1. Thorough research of explainability metrics
    1. Collected extensive data on this topic
  2. Successfully tested GPT-3 for producing explanations from latent factors
  3. Planned future steps
  1. Completed data collection for the low-impact domain
    1. Collected IMDB data using web scraping
  2. Completed data cleaning for the low-impact domain
    1. Processed IMDB data to retrieve descriptions, directors, and actors.
    2. Used a BERT transformer to generate embeddings for descriptions (sketched after this list).
  3. Completed data analysis for the low-impact domain
    1. Analysed the produced data, ready to be fed into the RS
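
The BERT embedding step could look roughly like this (Hugging Face transformers with bert-base-uncased and mean pooling; the exact model and pooling used in the project may differ):

  import torch
  from transformers import AutoTokenizer, AutoModel

  tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
  model = AutoModel.from_pretrained("bert-base-uncased")

  def embed_descriptions(descriptions):
      # Tokenise a batch of IMDB plot descriptions and mean-pool the final
      # hidden states into one fixed-size vector per movie.
      batch = tokenizer(descriptions, padding=True, truncation=True,
                        max_length=256, return_tensors="pt")
      with torch.no_grad():
          hidden = model(**batch).last_hidden_state   # (n, seq_len, 768)
      mask = batch["attention_mask"].unsqueeze(-1)    # zero out padding
      return (hidden * mask).sum(1) / mask.sum(1)     # (n, 768)

  # These 768-dimensional vectors are what the RS consumes as item features.
  vectors = embed_descriptions(["A retired hitman is pulled back for one last job."])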

Milestones achieved pt 1

Project Plan Overview

Comparison with Actual Progress

  1. Focused on back end before front end
  2. Revised the Gantt chart for a more appropriate completion schedule
  3. Revised the time spent on data collection and analysis
  4. Revised front end
  5. Proposed new user study format
  6. Proposed the novel technique of using the ChatGPT API to process the recommendations.

Adjustments Made

  • Challenge: Data Accessibility Issues
    • Accessing a robust dataset for the high-impact domain proved difficult.
    • The initial attempt to scrape data from Glassdoor was met with technical roadblocks:
    • the IP address was repeatedly blocked.
  • Solution: Innovative Data Collection Strategies
    • Researched and found an external application
      • designed for dynamic web scraping, which helped bypass the IP-blocking issue.
    • Devised a strategy to periodically rotate our IP addresses (sketched after this list)
      • allowing data collection to continue without interruption (currently in progress).
    • As a backup, fall back on a base dataset or synthetic data.
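
A minimal sketch of the IP-rotation idea, using requests and a hypothetical pool of proxy endpoints (real proxy URLs, rate limits, and the site's terms of use would all need handling in practice):

  import itertools
  import time
  import requests

  # Hypothetical proxy pool; in practice this would come from a rotating-proxy
  # service or the external scraping application mentioned above.
  PROXIES = itertools.cycle([
      "http://proxy1.example.com:8080",
      "http://proxy2.example.com:8080",
  ])

  def fetch(url, retries=3, delay=5.0):
      # Rotate to the next proxy on every attempt and back off between tries,
      # so a single blocked IP does not halt the whole scrape.
      for attempt in range(retries):
          proxy = next(PROXIES)
          try:
              resp = requests.get(url, timeout=10,
                                  proxies={"http": proxy, "https": proxy})
              if resp.status_code == 200:
                  return resp.text
          except requests.RequestException:
              pass
          time.sleep(delay * (attempt + 1))  # linear back-off
      raise RuntimeError(f"could not fetch {url} after {retries} attempts")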

Challenges and Solutions

  1. Research the latent factors method (sketched after this list)
  2. Complete LF for ML
  3. Create explanations for the low-impact domain
  4. Evaluate on metrics
  5. Repeat for the high-impact domain
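
As a reference for steps 1–2, a minimal latent-factor model (matrix factorisation trained by SGD on explicit ratings) can be sketched in NumPy as below; the factor dimension, learning rate, and regularisation values are illustrative:

  import numpy as np

  def train_latent_factors(ratings, n_users, n_items, k=16,
                           lr=0.01, reg=0.05, epochs=20):
      # ratings: list of (user, item, rating) triples.
      # P and Q hold the user and item latent factors; the learned item
      # factors are what later gets verbalised into explanations.
      rng = np.random.default_rng(0)
      P = rng.normal(scale=0.1, size=(n_users, k))
      Q = rng.normal(scale=0.1, size=(n_items, k))
      for _ in range(epochs):
          for u, i, r in ratings:
              err = r - P[u] @ Q[i]
              P[u] += lr * (err * Q[i] - reg * P[u])
              Q[i] += lr * (err * P[u] - reg * Q[i])
      return P, Q

  # Predicted rating for user u and item i is then simply P[u] @ Q[i].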

Next Steps

Recommender systems: exploring the effects of explainability on user trust

SAFFRON ZAINCHKOVSKAYA

THANK YOU!

SUCCESS METRICS- PRODUCT

Evaluation Criteria:

  1. Accuracy: How accurate are the RSs? Has the incorporation of explainability lowered the accuracy?
  2. Explainability: Is the incorporation of explainable recommendations within systems beneficial to users?
  3. User Trust: Does this incorporation enhance users' trust in the system?
  4. Domains: Is the incorporation of explanations more important in a high- or low-impact domain?
  5. Project Completion: How effectively was the project completed within the stipulated timeframe and project scope? Does the completion status of the project reflect on the quality and reliability of the RSs?
      1. The project will be evaluated on a bi-weekly basis to ensure it is consistently on track.
      2. During this evaluation, the project components below will be examined.

SUCCESS METRICS- PROJECT

PROJECT EVALUATION

PLANNING

PROJECT PROBLEM

  • Central problem addressed by this project:
    • the lack of knowledge surrounding the effects of incorporating explainability within RSs,
    • and the impact this has on a user's level of trust in the system.
  • Also addresses a problem seen commonly in the literature:
    • where only one domain is investigated.