Subquery Help

,

I’m hoping someone can straighten me out on what seems to be a very simple subquery that doesn’t seem to be returning records in the manner I expect. I think I’m just having a bit of a brain-hiccup on setting up the BAQ properly to obtain the expected results.

Basically, I’m using the [Erp.Project] table linked to the [Erp.HDCase] table (which is set up as a subquery) in order to return the most recent case record associated with each individual project record. The [Erp.HDCase] subquery is sorted by [Erp.HDCase.LastUpdatedDate] and is selected as a “Top 1” query. The strategy here, as previously mentioned, is to grab only the most recent case record from that table.

Here is a general depiction of the BAQ configuration:

So, what’s the problem?

Well - I return all of the field values from the [Erp.Project] table with no issues, since that is part of the the “top level” query. However, I receive absolutely NO values from the [Erp.HDCase] table - UNLESS - I have ALL values entered from the group of fields I am requesting to be returned from that table. If I create a new case record with only a few of the fields filled in, the query never picks up that new case record. It seems that ALL of the values for the case record need to be filled in.

It’s fairly obvious that I’m stumbling on something pretty basic. Do you guys notice anything glaring that I’m missing either in the BAQ set-up, or in my expectations on the returned values?

I think your Top filtering will run BEFORE the Join so you’ll always be getting the Top (first) result back from the subquery.
Can you look at the Phase Builder and paste it here?
I think you shoud instead do a regular query (sans Top) and use a GroupBy agregate MAX to get the latest Case

SELECT Company, Project, Max(HDCase.HDCaseNum) from HDcase 
GROUP BY Company,Project

You can affect the order of the logical queries by changing the table order.

Thanks for the quick response, Jose.

Here’s the details on the query phrase:

select 
	[Project].[Company] as [Project_Company],
	[Project].[ProjectID] as [Project_ProjectID],
	[Project].[CofADate_c] as [Project_CofADate_c],
	[Project].[DeliveryDate_c] as [Project_DeliveryDate_c],
	[Project].[Model_c] as [Project_Model_c],
	[qryHDCase].[HDCase_HDCaseNum] as [HDCase_HDCaseNum],
	[qryHDCase].[HDCase_ActiveTaskID] as [HDCase_ActiveTaskID],
	[qryHDCase].[HDCase_CreatedDate] as [HDCase_CreatedDate],
	[qryHDCase].[HDCase_CreatedBy] as [HDCase_CreatedBy],
	[qryHDCase].[HDCase_LastUpdatedDate] as [HDCase_LastUpdatedDate],
	[qryHDCase].[HDCase_LastUpdatedBy] as [HDCase_LastUpdatedBy],
	[qryHDCase].[HDCase_HDCaseStatus] as [HDCase_HDCaseStatus],
	[qryHDCase].[HDCase_AirplaneAFTT_c] as [HDCase_AirplaneAFTT_c],
	[qryHDCase].[HDCase_AirplaneCountryOpp_c] as [HDCase_AirplaneCountryOpp_c],
	[qryHDCase].[HDCase_AirplaneEngineCycle_c] as [HDCase_AirplaneEngineCycle_c],
	[qryHDCase].[HDCase_AirplaneLandings_c] as [HDCase_AirplaneLandings_c],
	[qryHDCase].[HDCase_AirplaneOwner_c] as [HDCase_AirplaneOwner_c],
	[qryHDCase].[HDCase_AirplaneTailID_c] as [HDCase_AirplaneTailID_c]
from Erp.Project as Project
left outer join  (select top (1)  
	[HDCase].[Company] as [HDCase_Company],
	[HDCase].[HDCaseNum] as [HDCase_HDCaseNum],
	[HDCase].[ProjectID] as [HDCase_ProjectID],
	[HDCase].[ActiveTaskID] as [HDCase_ActiveTaskID],
	[HDCase].[CreatedDate] as [HDCase_CreatedDate],
	[HDCase].[CreatedBy] as [HDCase_CreatedBy],
	[HDCase].[LastUpdatedDate] as [HDCase_LastUpdatedDate],
	[HDCase].[LastUpdatedBy] as [HDCase_LastUpdatedBy],
	[HDCase].[HDCaseStatus] as [HDCase_HDCaseStatus],
	[HDCase].[AirplaneAFTT_c] as [HDCase_AirplaneAFTT_c],
	[HDCase].[AirplaneCountryOpp_c] as [HDCase_AirplaneCountryOpp_c],
	[HDCase].[AirplaneEngineCycle_c] as [HDCase_AirplaneEngineCycle_c],
	[HDCase].[AirplaneLandings_c] as [HDCase_AirplaneLandings_c],
	[HDCase].[AirplaneOwner_c] as [HDCase_AirplaneOwner_c],
	[HDCase].[AirplaneTailID_c] as [HDCase_AirplaneTailID_c]
from Erp.HDCase as HDCase
order by HDCase.LastUpdatedDate Desc)  as qryHDCase on 
	Project.Company = qryHDCase.HDCase_Company
	and Project.ProjectID = qryHDCase.HDCase_ProjectID
where (Project.CofADate_c is not null)

ok see if you look at the phrase your select top(1) runs in parenthesis first before the join. Which means you are getting the equivalent of the first record in that table and then Joining

So - the suggestion you had where I would need to create a simple query and a MAX aggregate (instead of using the subquery) is the way to go, perhaps?

… and thanks for the explanation of the order-of-evaluation there - that clarifies a lot for me, for sure.

Right the groupby will work cause you are geting the MAX CaseNum assuming these go in incremental order (they do)

Perfect. I’ll run through it and post the results back here in the thread.
Thanks for reaching out and lending a hand, Jose.

1 Like

Actually to further clarify your group by will get EVERY MAX() Case Number related to a project and then the join will filter it down to your current one

@josecgomez / @Chris_Conn :

After tinkering a bit on this, it seems I don’t quite understand subqueries as well as I originally thought (or, at least it appears to work differently in the BAQ Designer than it would in SQL).

So, I start with the [Erp.Project] table and set it to return distinct records with a specific date criteria. I then pull in the [Erp.HDCase] table and link it by [Company] and [ProjectID]. I create a calculated field to set up the MAX() aggregate on the [Erp.HDCase.LastUpdatedDate] field. That calculated field is now part of the query and seems to play nicely.

As soon as I pull in another field from [Erp.HDCase], the query requires that field to be part of the GROUP BY statement – and that’s when I lose integrity on the [Erp.Project] table… because I can’t really group on a field that can have unlimited values in it, as those values now become distinct GROUP BY data points.

Here a depiction:

I might be overthinking things a bit, perhaps.

IMHO, it’s pretty rare that you need to set the distinct setting in a query. Usually, if the joins are correct, they end up being distinct anyways. If you have doubles, it’s often a symptom of something wrong.

so this is where you would set your first part of you query as a sub query, then make a top level, bring in your subquery that has the rows you want, then join to the project table again.

4 Likes

Just relink the table (outside of the subquery)

Brandon: I’m a bit twisted around, admittedly. When you say, “… then join to the project table again”, are you saying that the [Erp.HDCase] subquery, that should be joined to [Erp.Project], should be joined back to [Erp.Project] a second time?

[“again” = a second time]

@Chris_Conn:

When you say, “Just relink the table…”, are you saying to relink (i.e. - link a second time) the [Erp.HDCase] table back to the subquery that already contains the [Erp.HDCase] table?

Correct.

MyHDCaseSubQuery -> HDCase on the key fields

I’ll work up some screen shots/example tomorrow. I’m pretty sure you are confused on what a subquery is?

@Banderson:

Before you do that… I’ve made some headway and I’d like to post what I have (by tomorrow morning) so that you can evaluate. I’ve basically used pointers literally from you, @Chris_Conn and @josecgomez throughout the thread.

I’d like to try and explain it from my perspective and maybe y’all can tell me if I’m off/on track. More to follow after I put everything together.

@Banderson - as I understand it, the subquery is essentially an embedded query that resides within a larger, base query and it is typically used to further modify the returned data derived from the base query (best I can describe in a nutshell - but I’m sure there’s a more potent definition out there).

What was tripping me up, mostly, was I had assumed that the additional fields I might want to have returned from the subquery should be defined in the “subquery object”. Why? Because that’s where I’m pulling in the [Erp.HDCase] table – directly to the “subquery object” (it’s all graphical in the BAQ Designer, so that’s why I’m referring to it as an “object”). However, that subquery object is nothing more than a limiter that further modifies what I need to have returned from the [Erp.HDCase] table (which is why @Banderson and @Chris_Conn had suggested to link the [Erp.HDCase] table back to the subquery) . I’m sure y’all might be able to explain this a bit better than what I’ve described here.

I used @josecgomez’s suggestion to employ the MAX() aggregate, as you’ll see shortly, but it took me a bit of time to realize that what makes that work is creating it within the subquery and then linking that over to the same field (non-calculated version, of course) in the [Erp.HDCase] table. So, I’m telling [Erp.HDCase] that I need it to show relative rows equal to the MAX last updated date. And you’re correct - I had no need to set the “DISTINCT” limiter on this at all.

Here’s the depiction:

End result is that it’s giving me precisely what I expect (and need).

I’m obviously going to require a bit more practice putting subqueries together in the BAQ Designer before I fully understand it 100%, but each of you provided me with a nice foundation to work from. Much appreciation to you guys.

I think the key breakdown for you was the same that occurs to many people when getting into aggregates\grouping. When using a query for an aggregate (max, sum, etc) - all fields in the query must be either:
An aggregate
Grouped

When grouping, you have to only use the fields to make your logical group, since it’s essentially trying to partition your data. These groups\aggregates are all that is returned to the next level in the query.

1 Like

Nice work @BA-Quest. I think if it like this. A subquery is just a custom made table. You grab the pieces you need to get your info, do whatever grouping or whatever, then after that, it’s just another table for you to use. Sometimes you end up rejoining to the same table you used to make it, but so what? That’s fine.

4 Likes