Solving a LinkedIn Data Science SQL Interview Question

Hi everyone, today I go over how to tackle a medium difficulty interview question asked on a data science interview at LinkedIn!
👉 Subscribe to my channel: https://bit.ly/2xYkyUM
📧 Sign up for free weekly interview questions: https://www.interviewquery.com
🔑 Get professional help for your next interview: https://www.interviewquery.com/coaching
❓ Try a LinkedIn SQL interview question: https://www.interviewquery.com/questions/employee-salaries

More from Jay:
Follow me on Twitter: https://twitter.com/datasciencejay
Follow me on LinkedIn: https://www.linkedin.com/in/jay-feng-ab66b049/

Here’s a great guide on SQL concepts everyone needs to know for their interviews: https://www.interviewquery.com/blog-three-sql-questions-you-must-know-to-pass/

source by Data Science Jay

sql

Mourad ELGORMA

Fondateur de summarynetworks, passionné des nouvelles technologies et des métiers de Réseautique , Master en réseaux et système de télécommunications. ,j’ai affaire à Pascal, Delphi, Java, MATLAB, php …Connaissance du protocole TCP / IP, des applications Ethernet, des WLAN …Planification, installation et dépannage de problèmes de réseau informatique……Installez, configurez et dépannez les périphériques Cisco IOS. Surveillez les performances du réseau et isolez les défaillances du réseau. VLANs, protocoles de routage (RIPv2, EIGRP, OSPF.)…..Manipuler des systèmes embarqués (matériel et logiciel ex: Beaglebone Black)…Linux (Ubuntu, kali, serveur Mandriva Fedora, …). Microsoft (Windows, Windows Server 2003). ……Paquet tracer, GNS3, VMware Workstation, Virtual Box, Filezilla (client / serveur), EasyPhp, serveur Wamp,Le système de gestion WORDPRESS………Installation des caméras de surveillance ( technologie hikvision DVR………..). ,

23 réflexions sur “Solving a LinkedIn Data Science SQL Interview Question

  • juin 21, 2021 à 3:01
    Permalien

    Window Function of Rank on number of times posted partitioned by user_id, jobId and order by desc of number of times

    Filter for 1 thereby getting the max

    case when this max > 1 then more than once else less than once

    we can just use a max as well at the same hierarchy level

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    I guess this should work:
    I did not consider date coz that was not in specification, if a job id is posted multiple times, it can be on same day or any other day.
    Would appreciate your response if I missed anything :

    Select

    sum(case when totals= 1 then 1 else 0) as posted_once,

    sum(case when totals >1 then 1 else 0) as Posted_multiples from

    select userid, jobid, count(*) as totals from jobposting group by userid, jobid

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    I am trying to imitate how you explain your thinking, I think it is very important in a live interview! Great video jay.

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    This was my solution for number 1

    select

    p.job_poster,count(p.job_poster)

    from

    (

    select

    case when sum( case when x.pg_count>=2 then 1 else 0 end)=0 then 'Single Time Poster'

    else 'Posted > 1' end

    as job_poster

    from(

    select user_id,job_id, count(job_id) as pg_count

    from job_postings

    group by user_id,job_id

    ) as x

    group by x.user_id

    ) as p

    group by p.job_poster;

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    mysql> select * from job_postings;

    +——+——–+———+————-+

    | id | job_id | user_id | date_posted |

    +——+——–+———+————-+

    | 1 | 1 | 1 | 2020-12-23 |

    | 2 | 1 | 1 | 2020-12-24 |

    | 3 | 2 | 1 | 2020-12-25 |

    | 4 | 3 | 2 | 2020-12-25 |

    +——+——–+———+————-+

    4 rows in set (0.00 sec)

    mysql> SELECT

    -> SUM( CASE WHEN t.ratio = 1 THEN 1 ELSE 0 END ) AS num_user_posted_jobs_once,

    -> SUM( CASE WHEN t.ratio > 1 THEN 1 ELSE 0 END ) AS num_user_posted_jobs_multiply_times

    -> FROM

    -> (

    -> SELECT user_id, (COUNT(DISTINCT date_posted) / COUNT(DISTINCT job_id)) AS ratio

    -> FROM job_postings

    -> GROUP BY user_id

    -> ) AS t

    -> ;

    +—————————+————————————-+

    | num_user_posted_jobs_once | num_user_posted_jobs_multiply_times |

    +—————————+————————————-+

    | 1 | 1 |

    +—————————+————————————-+

    1 row in set, 1 warning (0.01 sec)

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    Is the whole problem#2 part missing? What about the "within 180 days of posting the same job again" constraint? I think you'll need a self join to do that.

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    dude you unnecessarily complicated it, could have used count(*) as frequency, a switch case, and a having clause that checks count(*) >1

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    with jobs as (
    select user_id, job_id, count(*) as num_posts
    from job_postings
    group by 1,2
    )
    select
    count(distinct case when num_posts > 1 then user_id end) as posted_at_least_one_job_multiple_times,
    (select count(distinct user_id) from job_postings) – count(distinct case when num_posts > 1 then user_id end) as posted_jobs_once
    from jobs

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    –MYSQL answer I came up with
    CREATE VIEW user_job_jobpost AS
    SELECT user_id, job_id, COUNT(job_id) AS num_job_posts FROM jobs
    GROUP BY user_id, job_id
    ORDER BY num_job_posts DESC;

    SELECT SUM(poster_status = "multiposter") AS count_multiposters,
    SUM(poster_status = "singleposter") AS count_singleposters FROM
    (SELECT user_id,
    IF(AVG(num_job_posts) > 1, "multiposter", "singleposter") AS poster_status
    FROM user_job_jobpost
    GROUP BY user_id) as user_poster_status;

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    I think I would disconnect because of the legacy system they probably have to many problems even with paperwork processes or just general reporting…

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    It would be great to hear your thoughts on this query. It might be bit short but the credit for crux of the code goes to Jay.

    With cte (
    Select user_id, job_id, count(distinct date_posted) as job_posted from job_posting
    Group by 1,2)

    Select sum(case when max(job_posted) over () = 1 then 1 end) as user_once,
    sum(case when max(job_posted) over() != 1 then 1 end) as user_multiple from cte
    Group by user_id;

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    Thank you for your help Jay. How would this query work?

    WITH user_job AS
    (
    SELECT user_id, job_id, COUNT(DISTINCT date_posted) AS #_posted
    FROM job_postings
    GROUP BY user_id, job_id
    )
    SELECT SUM(IF(MAX(#_posted) = 1, 1, 0 ) AS #_once, SUM(IF(MAX(#_posted) > 1, 1, 0) AS more_than_once
    FROM user_job
    GROUP BY user_id

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    A bit confused as you assumed all job ids are the same job which I would say the opposite. Each job Id is a unique job so you taking to average would not be right.

    Where in the problem statement does it say all job ids are the same for a given user?

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    Kindly please share your thoughts on the below query for the above requirement
    Thanks a lot

    SELECT

    SUM(CASE WHEN [totalNumOFJobsPosted] > 1 THEN 1 END) AS 'JOB POSTED MULTIPLE TIMES',

    SUM(CASE WHEN [totalNumOFJobsPosted] = 1 THEN 1 END) AS 'JOB POSTED ONLY ONCE'

    FROM

    (

    SELECT USER_ID,COUNT(DATE_POSTED) AS [totalNumOFJobsPosted] FROM [dbo].[JOB_POSTINGS]

    GROUP BY USER_ID

    ) AS T

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    I cannot believe this was a "medium" level question, I would have totally expected that to be a hard difficulty level! haha guess I need more practice…

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    I think this should work for the 1st question –

    with cte(user_id, freq) as

    (select user_id, case when sum(ct) = count(job_id) then 'Y' else 'N' end as freq

    from

    (select user_id, job_id, count(id) as ct

    from job_postings

    group by user_id, job_id)

    group by user_id)

    select count(case when freq = 'Y' then 1 else null end) as only_once ,

    count(case when freq = 'N' then 1 else null end) as more_than once

    from cte;

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    select user_id, job_id, count(*) as cnt

    from job_postings

    group by user_id, job_id

    having cnt = 1

    union

    select user_id, job_id, count(*) as cnt

    from job_postings

    group by user_id, job_id

    having cnt > 1

    Répondre
  • juin 21, 2021 à 3:01
    Permalien

    I think the query where your rolling up for getting final output is incorrect. You should be using count of userid in outer query not sum as aggregate.

    Instead of using avg. Using max could be a better option.

    Répondre

Laisser un commentaire